Evolving new molecular function

ABSTRACT

Nature evolves biological molecules such as proteins through iterated rounds of diversification, selection, and amplification. The present invention provides methods, compositions, and systems for synthesizing, selecting, amplifying, and evolving non-natural molecules based on nucleic acid templates. The sequence of a nucleic acid template is used to direct the synthesis of non-natural molecules such as unnatural polymers and small molecules. Using this method, combinatorial libraries of these molecules can be prepared and screened. Upon selection of a molecule, its encoding nucleic acid template may be amplified and/or evolved to yield the same molecule or related molecules for re-screening. The inventive methods and compositions allow for the amplification and evolution of non-natural molecules in a manner analogous to the amplification of natural biopolymers such as polynucleotides and proteins.

PRIORITY INFORMATION

This application claims priority under 35 U.S.C. §119(e) to U.S. Provisional patent applications 60/277,081, filed Mar. 19, 2001, entitled “Nucleic Acid Directed Synthesis of Chemical Compounds”; 60/277,094, filed Mar. 19, 2001, entitled “Approaches to Generating New Molecular Function”; and 60/306,691, filed Jul. 20, 2001, entitled “Approaches to Generating New Molecular Function”, and the entire contents of each of these applications are hereby incorporated by reference.

BACKGROUND OF THE INVENTION

The classic “chemical approach” to generating molecules with new functions has been used extensively over the last century in applications ranging from drug discovery to synthetic methodology to materials science. In this approach (FIG. 1, black), researchers synthesize or isolate candidate molecules, assay these candidates for desired properties, determine the structures of active compounds if unknown, formulate structure-activity relationships based on the assay and structural data, and then synthesize a new generation of molecules designed to possess improved properties. While combinatorial chemistry methods (see, for example, A. V. Eliseev and J. M. Lehn. Combinatorial Chemistry In Biology 1999, 243, 159-172; K. W. Kuntz, M. L. Snapper and A. H. Hoveyda. Current Opinion in Chemical Biology 1999, 3, 313-319; D. R. Liu and P. G. Schultz. Angew. Chem. Intl. Ed. Eng. 1999, 38, 36) have increased the throughput of this approach, its fundamental limitations remain unchanged. Several factors limit the effectiveness of the chemical approach to generating molecular function. First, our ability to accurately predict the structural changes that will lead to new function is often inadequate due to subtle conformational rearrangements of molecules, unforeseen solvent interactions, or unknown stereochemical requirements of binding or reaction events. The resulting complexity of structure-activity relationships frequently limits the success of rational ligand or catalyst design, including those efforts conducted in a high-throughput manner. Second, the need to assay or screen, rather than select, each member of a collection of candidates limits the number of molecules that can be searched in each experiment. Finally, the lack of a way to amplify synthetic molecules places requirements on the minimum amount of material that must be produced for characterization, screening, and structure elucidation. As a result, it can be difficult to generate libraries of more than roughly 10⁶ different synthetic compounds.

In contrast, Nature generates proteins with new functions using a fundamentally different method that overcomes many of these limitations. In this approach (FIG. 1, gray), a protein with desired properties induces the survival and amplification of the information encoding that protein. This information is diversified through spontaneous mutation and DNA recombination, and then translated into a new generation of candidate proteins using the ribosome. The power of this process is well appreciated (see, F. Arnold Acc. Chem. Res. 1998, 31, 125; F. H. Arnold et al. Curr. Opin. Chem. Biol. 1999, 3, 54-59; J. Minshull et al. Curr. Opin. Chem. Biol. 1999, 3, 284-90) and is evidenced by the fact that proteins and nucleic acids dominate the solutions to many complex chemical problems despite their limited chemical functionality. Clearly, unlike the linear chemical approach described above, the steps used by Nature form a cycle of molecular evolution. Proteins emerging from this process have been directly selected, rather than simply screened, for desired activities. Because the information encoding evolving proteins (DNA) can be amplified, a single protein molecule with desired activity can in theory lead to the survival and propagation of the DNA encoding its structure. The vanishingly small amounts of material needed to participate in a cycle of molecular evolution allow libraries much larger in diversity than those synthesized by chemical approaches to be generated and selected for desired function in small volumes.

Acknowledging the power and efficiency of Nature's approach, researchers have used molecular evolution to generate many proteins and nucleic acids with novel binding or catalytic properties (see, for example, J. Minshull et al. Curr. Opin. Chem. Biol. 1999, 3, 284-90; C. Schmidt-Dannert et al. Trends Biotechnol. 1999, 17, 135-6; D. S. Wilson et al. Annu. Rev. Biochem. 1999, 68, 611-47). Proteins and nucleic acids evolved by researchers have demonstrated value as research tools, diagnostics, industrial reagents, and therapeutics and have greatly expanded our understanding of the molecular interactions that endow proteins and nucleic acids with binding or catalytic properties (see, M. Famulok et al. Curr. Opin. Chem. Biol. 1998, 2, 320-7).

Despite nature's efficient approach to generating function, nature's molecular evolution is limited to two types of “natural” molecules (proteins and nucleic acids) because thus far the information in DNA can only be translated into proteins or into other nucleic acids. However, many synthetic molecules of interest do not in general resemble nucleic acid backbones, and the use of DNA-templated synthesis to translate DNA sequences into synthetic small molecules would be broadly useful only if synthetic molecules other than nucleic acids and nucleic acid analogs could be synthesized in a DNA-templated fashion. An ideal approach to generating functional molecules would merge the most powerful aspects of molecular evolution with the flexibility of synthetic chemistry. Clearly, enabling the evolution of non-natural synthetic small molecules and polymers, similarly to the way nature evolves biomolecules, would lead to much more effective methods of discovering new synthetic ligands, receptors, and catalysts difficult or impossible to generate using rational design.

SUMMARY OF THE INVENTION

The recognition of the need to be able to amplify and evolve classes of molecules besides nucleic acids and proteins led to the present invention providing methods and compositions for the template-directed synthesis, amplification, and evolution of molecules. In general, these methods use an evolvable template to direct the synthesis of a chemical compound or library of chemical compounds (i.e., the template actually encodes the synthesis of a chemical compound). Based on a library encoded and synthesized using a template such as a nucleic acid, methods are provided for amplifying, evolving, and screening the library. In certain embodiments of special interest, the chemical compounds are compounds that are not, or do not resemble, nucleic acids or analogs thereof. In certain embodiments, the chemical compounds of these template-encoded combinatorial libraries are polymers and more preferably are unnatural polymers (i.e., excluding natural peptides, proteins, and polynucleotides). In other embodiments, the chemical compounds are small molecules.

In certain embodiments, the method of synthesizing a compound or library of compounds comprises first providing one or more nucleic acid templates, which one or more nucleic acid templates optionally have a reactive unit associated therewith. The nucleic acid template is then contacted with one or more transfer units designed to have a first moiety, an anti-codon, which hybridizes to a sequence of the nucleic acid, and is associated with a second moiety, a reactive unit, which includes a building block of the compound to be synthesized. Once these transfer units have hybridized to the nucleic acid template in a sequence-specific manner, the synthesis of the chemical compound can take place due to the interaction of reactive moieties present on the transfer units and/or the nucleic acid template. Significantly, the sequence of the nucleic acid can later be determined to decode the synthetic history of the attached compound and thereby its structure. It will be appreciated that the method described herein may be used to synthesize one molecule at a time or may be used to synthesize thousands to millions of compounds using combinatorial methods.
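By way of illustration only, the decoding step described above, in which the template sequence is read to recover the synthetic history of the attached compound, can be sketched in a few lines of Python. The codon table, the codon length of three, and the building-block names are hypothetical placeholders chosen for this sketch and are not taken from the specification.

```python
from typing import Dict, List

# Hypothetical codon table: both the codons and the building-block names
# are illustrative assumptions, not values from the specification.
CODON_TO_BLOCK: Dict[str, str] = {
    "ACG": "block_A",
    "CAG": "block_B",
    "GTC": "block_C",
}
CODON_LENGTH = 3  # assumed fixed-length codons


def decode_template(template: str) -> List[str]:
    """Read a template codon by codon and return the encoded building blocks in order."""
    if len(template) % CODON_LENGTH != 0:
        raise ValueError("template length is not a multiple of the codon length")
    blocks = []
    for i in range(0, len(template), CODON_LENGTH):
        codon = template[i:i + CODON_LENGTH]
        if codon not in CODON_TO_BLOCK:
            raise ValueError(f"unknown codon {codon!r} at position {i}")
        blocks.append(CODON_TO_BLOCK[codon])
    return blocks
```

Under these assumptions, calling `decode_template("ACGGTCCAG")` returns the ordered list of building blocks that the template would have recruited, which is the sense in which the nucleic acid records the structure of the attached compound.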

It will be appreciated that libraries synthesized in this manner (i.e., having been encoded by a nucleic acid) have the advantage of being amplifiable and evolvable. Once a molecule is identified, its nucleic acid template, besides acting as a tag used to identify the attached compound, can also be amplified using standard DNA techniques such as the polymerase chain reaction (PCR). The amplified nucleic acid can then be used to synthesize more of the desired compound. In certain embodiments, during the amplification step mutations are introduced into the nucleic acid in order to generate a population of chemical compounds that are related to the parent compound but are modified at one or more sites. The mutated nucleic acids can then be used to synthesize a new library of related compounds. In this way, the library being screened can be evolved to contain more compounds with the desired activity or to contain compounds with a higher degree of activity.
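The diversification step described above can be pictured with the following minimal Python sketch, offered only as an illustration. It models amplification with mutation as independent per-base substitutions at a fixed rate; the function name, the substitution-only error model, and the rate parameter are assumptions made for this sketch and do not describe any particular mutagenic PCR protocol.

```python
import random

BASES = "ACGT"


def amplify_with_mutation(template, copies, rate, rng):
    """Return `copies` copies of `template`, substituting each base with probability `rate`.

    Hypothetical error model: substitutions only, independent at each position.
    """
    progeny = []
    for _ in range(copies):
        new_bases = []
        for base in template:
            if rng.random() < rate:
                # Replace the base with one of the three other bases.
                new_bases.append(rng.choice([b for b in BASES if b != base]))
            else:
                new_bases.append(base)
        progeny.append("".join(new_bases))
    return progeny
```

For example, `amplify_with_mutation("ACGGTCCAG", 100, 0.05, random.Random(0))` yields a population of templates related to the parent but altered at a small number of sites, each of which could then direct the synthesis of a related compound.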

The methods of the present invention may be used to synthesize a wide variety of chemical compounds. In certain embodiments, the methods are used to synthesize and evolve unnatural polymers (i.e., excluding polynucleotides and peptides), which cannot be amplified and evolved using standard techniques currently available. In certain other embodiments, the inventive methods and compositions are utilized for the synthesis of small molecules that are not typically polymeric. In still other embodiments, the method is utilized for the generation of non-natural nucleic acid polymers.

The present invention also provides the transfer molecules (e.g., nucleic acid templates and/or transfer units) useful in the practice of the inventive methods. These transfer molecules typically include a portion capable of hybridizing to a sequence of nucleic acid and a second portion with monomers, other building blocks, or reactants to be incorporated into the final compound being synthesized. It will be appreciated that the two portions of the transfer molecule are preferably associated with each other either directly or through a linker moiety. It will also be appreciated that the reactive unit and the anti-codon may be present in the same molecule (e.g., a non-natural nucleotide having functionality incorporated therein).

The present invention also provides kits and compositions useful in the practice of the inventive methods. These kits may include nucleic acid templates, transfer molecules, monomers, solvents, buffers, enzymes, reagents for PCR, nucleotides, small molecule scaffolds, etc. The kit may be used in the synthesis of a particular type of unnatural polymer or small molecule.

DEFINITIONS

The term antibody refers to an immunoglobulin, whether natural or wholly or partially synthetically produced. All derivatives thereof which maintain specific binding ability are also included in the term. The term also covers any protein having a binding domain which is homologous or largely homologous to an immunoglobulin binding domain. These proteins may be derived from natural sources, or partly or wholly synthetically produced. An antibody may be monoclonal or polyclonal. The antibody may be a member of any immunoglobulin class, including any of the human classes: IgG, IgM, IgA, IgD, and IgE. Derivatives of the IgG class, however, are preferred in the present invention.

The term, associated with, is used to describe the interaction between or among two or more groups, moieties, compounds, monomers, etc. When two or more entities are “associated with” one another as described herein, they are linked by a direct or indirect covalent or non-covalent interaction. Preferably, the association is covalent. The covalent association may be through an amide, ester, carbon-carbon, disulfide, carbamate, ether, or carbonate linkage. The covalent association may also include a linker moiety such as a photocleavable linker. Desirable non-covalent interactions include hydrogen bonding, van der Waals interactions, hydrophobic interactions, magnetic interactions, electrostatic interactions, etc. Also, two or more entities or agents may be “associated” with one another by being present together in the same composition.

A biological macromolecule is a polynucleotide (e.g., RNA, DNA, RNA/DNA hybrid), protein, peptide, lipid, natural product, or polysaccharide. The biological macromolecule may be naturally occurring or non-naturally occurring. In a preferred embodiment, a biological macromolecule has a molecular weight greater than 500 g/mol.

Polynucleotide, nucleic acid, or oligonucleotide refers to a polymer of nucleotides. The polymer may include natural nucleosides (i.e., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine), nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methyl guanine, and 2-thiocytidine), chemically modified bases, biologically modified bases (e.g., methylated bases), intercalated bases, modified sugars (e.g., 2′-fluororibose, ribose, 2′-deoxyribose, arabinose, and hexose), or modified phosphate groups (e.g., phosphorothioates and 5′-N-phosphoramidite linkages).

A protein comprises a polymer of amino acid residues linked together by peptide bonds. The term, as used herein, refers to proteins, polypeptides, and peptides of any size, structure, or function. Typically, a protein will be at least three amino acids long. A protein may refer to an individual protein or a collection of proteins. A protein may refer to a full-length protein or a fragment of a protein. Inventive proteins preferably contain only natural amino acids, although non-natural amino acids (i.e., compounds that do not occur in nature but that can be incorporated into a polypeptide chain; see, for example, http://www.cco.caltech.edu/˜dadgrp/Unnatstruct.gif, which displays structures of non-natural amino acids that have been successfully incorporated into functional ion channels) and/or amino acid analogs as are known in the art may alternatively be employed. Also, one or more of the amino acids in an inventive protein may be modified, for example, by the addition of a chemical entity such as a carbohydrate group, a hydroxyl group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, a linker for conjugation, functionalization, or other modification, etc. A protein may also be a single molecule or may be a multi-molecular complex. A protein may be just a fragment of a naturally occurring protein or peptide. A protein may be naturally occurring, recombinant, or synthetic, or any combination of these.

The term small molecule, as used herein, refers to a non-peptidic, non-oligomeric organic compound either synthesized in the laboratory or found in nature. Small molecules, as used herein, can refer to compounds that are “natural product-like”; however, the term “small molecule” is not limited to “natural product-like” compounds. Rather, a small molecule is typically characterized in that it possesses one or more of the following characteristics including having several carbon-carbon bonds, having multiple stereocenters, having multiple functional groups, having at least two different types of functional groups, and having a molecular weight of less than 1500, although this characterization is not intended to be limiting for the purposes of the present invention.

The term small molecule scaffold, as used herein, refers to a chemical compound having at least one site for functionalization. In a preferred embodiment, the small molecule scaffold may have a multitude of sites for functionalization. These functionalization sites may be protected or masked as would be appreciated by one of skill in this art. The sites may also be found on an underlying ring structure or backbone.

The term transfer unit, as used herein, refers to a molecule comprising an anti-codon moiety associated with a reactive unit, including, but not limited to a building block, monomer, monomer unit, or reactant used in synthesizing the nucleic acid-encoded molecules.

DESCRIPTION OF THE FIGURES

FIG. 1 depicts nature's approach (gray) and the classical chemical approach (black) to generating molecular function.

FIG. 2 depicts certain DNA-templated reactions for nucleic acids and analogs thereof.

FIG. 3 depicts the general method for synthesizing a polymer using nucleic acid-templated synthesis.

FIG. 4 shows a quadruplet and triplet non-frameshifting codon set. Each set provides nine possible codons.

FIG. 5 shows methods of screening a library for bond-cleavage and bond-formation catalysts. These methods take advantage of streptavidin's natural affinity for biotin.

FIG. 6A depicts the synthesis directed by hairpin (H) and end-of-helix (E) DNA templates. Reactions were analyzed by denaturing PAGE after the indicated reaction times. Lanes 3 and 4 contained templates quenched with excess β-mercaptoethanol prior to reaction.

FIG. 6B depicts reactions in which matched (M) or mismatched (X) reagents linked to thiols (S) or primary amines (N) were mixed with 1 equiv of template functionalized with the variety of electrophiles shown. Reactions with thiol reagents were conducted at pH 7.5 under the following conditions: SIAB and SBAP: 37° C., 16 h; SIA: 25° C., 16 h; SMCC, GMBS, BMPS, SVSB: 25° C., 10 min. Reactions with amine reagents were conducted at 25° C., pH 8.5 for 75 minutes.

FIG. 7 depicts (a) H templates linked to an α-iodoacetamide group which were reacted with thiol reagents containing 0, 1, or 3 mismatches at 25° C. (b) Reactions in (a) were repeated at the indicated temperature for 16 h. Calculated reagent Tm: 38° C. (matched), 28° C. (single mismatch).

FIG. 8 depicts a reaction performed using a 41-base E template and a 10-base reagent designed to anneal 1-30 bases from the 5′ end of the template. The kinetic profiles in the graph show the average of two trials (deviations <10%). The “n=1 mis” reagent contains three mismatches.

FIG. 9 depicts the repeated n=10 reaction in FIG. 8 in which the nine bases following the 5′-NH2-dT were replaced with the backbone analogues shown. Five equivalents of a DNA oligonucleotide complementary to the intervening bases were added to the “DNA+clamp” reaction. Reagents were matched (0) or contained three mismatches (3). The gel shows reactions at 25° C. after 25 min.

FIG. 10 depicts the n=1, n=10, and n=1 mismatched (mis) reactions described in FIG. 8 which were repeated with template and reagent concentrations of 12.5, 25, 62.5 or 125 nM.

FIG. 11 depicts a model translation, selection and amplification of synthetic molecules that bind streptavidin from a DNA-encoded library.

FIG. 12 depicts (a) Lanes 1 and 5: PCR amplified library before streptavidin binding selection. Lanes 2 and 6: PCR amplified library after selection. Lanes 3 and 7: PCR amplified authentic biotin-encoding template. Lane 4: 20 bp ladder. Lanes 5-7 were digested with Tsp45I. DNA sequencing traces of the amplified templates before and after selection are also shown, together with the sequences of the non-biotin encoding and biotin-encoding templates. (b) General scheme for the creation and evolution of libraries of non-natural molecules using DNA-templated synthesis, where -R₁ represents the library of product functionality transferred from reagent library 1 and -R_(1B) represents a selected product.

FIG. 13 depicts exemplary DNA-templated reactions. For all reactions under the specified conditions, product yields of reactions with matched template and reagent sequences were greater than 20-fold higher than that of control reactions with scrambled reagent sequences. Reactions were conducted at 25° C. with one equivalent each of template and reagent at 60 nM final concentration unless otherwise specified. Conditions: a) 3 mM NaBH₃CN, 0.1 M MES buffer pH 6.0, 0.5 M NaCl, 1.5 h; b) 0.1 M TAPS buffer pH 8.5, 300 mM NaCl, 12 h; c) 0.1 M pH 8.0 TAPS buffer, 1 M NaCl, 5° C., 1.5 h; d) 50 mM MOPS buffer pH 7.5, 2.8 M NaCl, 22 h; e) 120 nM 19, 1.4 mM Na₂PdCl₄, 0.5 M NaOAc buffer pH 5.0, 18 h; f) Premix Na₂PdCl₄ with two equivalents of P(p-SO₃C₆H₄)₃ in water 15 min., then add to reactants in 0.5 M NaOAc buffer pH 5.0, 75 mM NaCl, 2 h (final [Pd]=0.3 mM, [19]=120 nM). The olefin geometry of products from 13 and the regiochemistries of cycloaddition products from 14 and 16 are presumed but not verified.

FIG. 14 depicts analysis by denaturing polyacrylamide gel electrophoresis of representative DNA-templated reactions listed in FIGS. 13 and 15. The structures of reagents and templates correspond to the numbering in FIGS. 13 and 15. Lanes 1, 3, 5, 7, 9, 11: reaction of matched (complementary) reagents and templates under conditions listed in FIGS. 13 and 15 (the reaction of 4 and 6 was mediated by DMT-MM). Lanes 2, 4, 6, 8, 10, 12: reaction of mismatched (non-complementary) reagents and templates under conditions identical to those in lanes 1, 3, 5, 7, 9 and 11, respectively.

FIG. 15 depicts DNA-templated amide bond formation mediated by EDC and sulfo-NHS or by DMT-MM for a variety of substituted carboxylic acids and amines. In each row, yields of DMT-MM-mediated reactions between reagents and templates complementary in sequence are followed by yields of EDC and sulfo-NHS-mediated reactions. Conditions: 60 nM template, 120 nM reagent, 50 mM DMT-MM in 0.1 M MOPS buffer pH 7.0, 1 M NaCl, 16 h, 25° C.; or 60 nM template, 120 nM reagent, 20 mM EDC, 15 mM sulfo-NHS, 0.1 M MES buffer pH 6.0, 1 M NaCl, 16 h, 25° C. In all cases, control reactions with mismatched reagent sequences yielded little or no detectable product.

FIG. 16 depicts (a) Conceptual model for distance-independent DNA-templated synthesis. As the distance between the reactive groups of an annealed reagent and template (n) is increased, the rate of bond formation is presumed to decrease. For those values of n in which the rate of bond formation is significantly higher than the rate of template-reagent annealing, the rate of product formation remains constant. In this regime, the DNA-templated reaction shows distance independence. (b) Denaturing polyacrylamide gel electrophoresis of a DNA-templated Wittig olefination between complementary 11 and 13 with either zero bases (lanes 1-3) or ten bases (lanes 4-6) separating annealed reactants. Although the apparent second order rate constants for the n=0 and n=10 reactions differ by three-fold (kapp (n=0)=9.9×10³ M⁻¹s⁻¹ while kapp (n=10)=3.5×10³ M⁻¹s⁻¹), product yields after 13 h at both distances are nearly quantitative. Control reactions containing sequence mismatches yielded no detectable product (not shown).

FIG. 17 depicts certain exemplary DNA-templated complexity building reactions.

FIG. 18 depicts certain exemplary linkers for use in the method of the invention.

FIG. 19 depicts certain additional exemplary linkers for use in the method of the invention.

FIG. 20 depicts an exemplary thioester linker for use in the method of the invention.

FIG. 21 depicts DNA-templated amide bond formation reactions in which reagents and templates are complexed with dimethyldidodecylammonium cations.

FIG. 22 depicts the assembly of transfer units along the nucleic acid template and polymerization of the nucleotide anti-codon moieties.

FIG. 23 depicts the polymerization of the dicarbamate units along the nucleic acid template to form a polycarbamate. To initiate polymerization the “start” monomer ending in an o-nitrobenzylcarbamate is photodeprotected to reveal the primary amine that initiates carbamate polymerization. Polymerization then proceeds in the 5′ to 3′ direction along the DNA backbone, with each nucleophilic attack resulting in the subsequent unmasking of a new amine nucleophile. Attack of the “stop” monomer liberates an acetamide rather than an amine, thereby terminating polymerization.

FIG. 24 depicts cleavage of the polycarbamate from the nucleotide backbone. Desilylation of the enol ether linker attaching the anti-codon moiety to the monomer unit and the elimination of phosphate driven by the resulting release of phenol provides the polycarbamate covalently linked at its carboxy terminus to its encoding single-stranded DNA.

FIG. 25 depicts components of an amplifiable, evolvable functionalized peptide nucleic acid library.

FIG. 26 depicts test reagents used to optimize reagents and conditions for DNA-templated PNA coupling.

FIG. 27 depicts a simple set of PNA monomers derived from commercially available building blocks useful for evolving a PNA-based fluorescent Ni²⁺ sensor.

FIG. 28 depicts two schemes for the selection of a biotin-terminated functionalized PNA capable of catalyzing an aldol or retroaldol reaction.

FIG. 29 depicts DNA-template-directed synthesis of a combinatorial small molecule library.

FIG. 30 shows schematically how DNA-linked small molecule scaffolds can be functionalized sequence-specifically by reaction with synthetic reagents linked to complementary nucleic acid oligonucleotides; this process can be repeated to complete the synthetic transformations leading to a fully functionalized molecule.

FIG. 31 shows the functionalization of a cephalosporin small molecule scaffold with various reactants.

FIG. 32 depicts a way of measuring the rate of reaction between a fixed nucleophile and an electrophile hybridized at varying distances along a nucleic acid template to define an essential reaction window in which nucleic acid-templated synthesis of nonpolymeric structures can take place.

FIG. 33 depicts three linker strategies for DNA-templated synthesis. In the autocleaving linker strategy, the bond connecting the product from the reagent oligonucleotide is cleaved as a natural consequence of the reaction. In the scarless and useful scar linker strategies, this bond is cleaved following the DNA-templated reaction. The depicted reactions were analyzed by denaturing polyacrylamide gel electrophoresis (below). Lanes 1-3 were visualized using UV light without DNA staining; lanes 4-10 were visualized by staining with ethidium bromide followed by UV transillumination. Conditions: 1 to 3: one equivalent each of reagent and template, 0.1 M TAPS buffer pH 8.5, 1 M NaCl, 25° C., 1.5 h; 4 to 6: three equivalents of 4, 0.1 M MES buffer pH 7.0, 1 M NaNO₂, 10 mM AgNO₃, 37° C., 8 h; 8 to 9: 0.1 M CAPS buffer pH 11.8, 60 mM BME, 37° C., 2 h; 11 to 12: 50 mM aqueous NaIO₄, 25° C., 2 h. R₁=NH(CH₂)₂NH-dansyl; R₂=biotin.

FIG. 34 depicts strategies for purifying products of DNA-templated synthesis. Using biotinylated reagent oligonucleotides, products arising from an autocleaving linker are partially purified by washing the crude reaction with avidin-linked beads (top). Products generated from DNA-templated reactions using the scarless or useful scar linkers can be purified by using biotinylated reagent oligonucleotides, capturing crude reaction products with avidin-linked beads, and eluting desired products by inducing linker cleavage (bottom).

FIG. 35 depicts the generation of an initial template pool for an exemplary library synthesis.

FIG. 36 depicts the DNA-templated synthesis of a non-natural peptide library.

FIG. 37 depicts a 5′-reagent DNA-linker-amino acid.

FIG. 38 depicts the DNA-templated synthesis of an evolvable diversity oriented bicyclic library.

FIG. 39 depicts DNA-templated multi-step tripeptide synthesis. Each DNA-templated amide formation used reagents containing the sulfone linker described in the text. Conditions: step 1: activate two equivalents 13 in 20 mM EDC, 15 mM sulfo-NHS, 0.1 M MES buffer pH 5.5, 1 M NaCl, 10 min, 25° C., then add to template in 0.1 M MOPS pH 7.5, 1 M NaCl, 25° C., 1 h; steps 2 and 3: two equivalents of reagent, 50 mM DMT-MM, 0.1 M MOPS buffer pH 7.0, 1 M NaCl, 6 h, 25° C. Desired product after each step was purified by capturing on avidin-linked beads and eluting with 0.1 M CAPS buffer pH 11.8, 60 mM BME, 37° C., 2 h. The progress of each reaction and purification was followed by denaturing polyacrylamide gel electrophoresis (bottom). Lanes 3, 6, and 9: control reactions using reagents containing scrambled oligonucleotide sequences.

FIG. 40 depicts non-peptidic DNA-templated multi-step synthesis. The reagent linkers used in steps 1, 2, and 3 were the diol linker, autocleaving Wittig linker, and sulfone linker, respectively; see FIG. 1 for linker cleavage conditions. Conditions: 17 to 18: activate two equivalents 17 in 20 mM EDC, 15 mM sulfo-NHS, 0.1 M MES buffer pH 5.5, 1 M NaCl, 10 min, 25° C., then add to template in 0.1 M MOPS pH 7.5, 1 M NaCl, 16° C., 8 h; 19 to 21: three equivalents 20, 0.1 M TAPS pH 9.0, 3 M NaCl, 48 h, 25° C.; 22 to 23: three equivalents 22, 0.1 M TAPS pH 8.5, 1 M NaCl, 21 h, 25° C. The progress of each reaction and purification was followed by denaturing polyacrylamide gel electrophoresis (bottom). Lanes 3, 6, and 9: control reactions using reagents containing scrambled oligonucleotide sequences.

FIG. 41 depicts the use of nucleic acids to direct the synthesis of new polymers and plastics by attaching the nucleic acid to the ligand of a polymerization catalyst. The nucleic acid can fold into a complex structure which can affect the selectivity and activity of the catalyst.

FIG. 42 depicts the use of Grubbs' ring-opening metathesis polymerization catalysis in evolving plastics. The synthetic scheme of a dihydroimidazole ligand attached to DNA is shown as well as the monomer to be used in the polymerization reaction.

FIG. 43 depicts the evolution of plastics through iterative cycles of ligand diversification, selection and amplification to create polymers with desired properties.

FIG. 44 depicts an exemplary scheme for the synthesis, in vitro selection and amplification of a library of compounds.

FIG. 45 depicts exemplary templates for use in recombination.

FIG. 46 depicts several exemplary deoxyribonucleotides and ribonucleotides bearing modifications to groups that do not participate in Watson-Crick hydrogen bonding and are known to be inserted with high sequence fidelity opposite natural DNA templates.

FIG. 47 depicts exemplary metal binding uridine and 7-deazaadenosine analogs.

FIG. 48 depicts the synthesis of analog (7).

FIG. 49 depicts the synthesis of analog (30).

FIG. 50 depicts the synthesis of 8-modified deoxyadenosine triphosphates.

FIG. 51 depicts the results of an assay evaluating the acceptance of modified nucleotides by DNA polymerases.

FIG. 52 depicts the synthesis of 7-deazaadenosine derivatives.

FIG. 53 depicts certain exemplary nucleotide triphosphates.

FIG. 54 depicts a general method for the generation of libraries of metal-binding polymers.

FIGS. 55 and 56 depict exemplary schemes for the in vitro selections for non-natural polymer catalysts.

FIG. 57 depicts an exemplary scheme for the in vitro selection of catalysts for Heck reactions, hetero Diels-Alder reactions and aldol additions.

DESCRIPTION OF CERTAIN EMBODIMENTS OF THE INVENTION

As discussed above, it would be desirable to be able to evolve and amplify chemical compounds including, but not limited to small molecules and polymers, in the same way that biopolymers such as polynucleotides and proteins can be amplified and evolved. It has been demonstrated that DNA-templated synthesis provides a possible means of translating the information in a sequence of DNA into a synthetic small molecule. In general, DNA templates linked to one reactant may be able to recruit a second reactive group linked to a complementary DNA molecule to yield a product. Since DNA hybridization is sequence-specific, the result of a DNA-templated reaction is the translation of a specific DNA sequence into a corresponding reaction product. As shown in FIG. 2, the ability of single-stranded nucleic acid templates to catalyze the sequence-specific oligomerization of complementary oligonucleotides (T. Inoue et al. J. Am. Chem. Soc. 1981, 103, 7666; T. Inoue et al. J. Mol. Biol. 1984, 178, 669-76) has been demonstrated. This discovery was soon followed by findings that DNA or RNA templates could catalyze the oligomerization of complementary DNA or RNA mono-, di-, tri-, or oligonucleotides (T. Inoue et al. J. Am. Chem. Soc. 1981, 103, 7666; L. E. Orgel et al. Acc. Chem. Res. 1995, 28, 109-118; H. Rembold et al. J. Mol. Evol. 1994, 38, 205; L. Rodriguez et al. J. Mol. Evol. 1991, 33, 477; C. B. Chen et al. J. Mol. Biol. 1985, 181, 271). DNA or RNA templates have since been shown to accelerate the formation of a variety of non-natural nucleic acid analogs, including peptide nucleic acids (C. Bohler et al. Nature 1995, 376, 578), phosphorothioate- (M. K. Herrlein et al. J. Am. Chem. Soc. 1995, 117, 10151-10152), phosphoroselenate- (Y. Xu et al. J. Am. Chem. Soc. 2000, 122, 9040-9041; Y. Xu et al. Nat. Biotechnol. 2001, 19, 148-152) and phosphoramidate- (A. Luther et al. Nature 1998, 396, 245-8) containing nucleic acids, non-ribose nucleic acids (M. Bolli et al. Chem. Biol. 1997, 4, 309-20), and DNA analogs in which a phosphate linkage has been replaced with an aminoethyl group (Y. Gat et al. Biopolymers 1998, 48, 19-28). Nucleic acid templates can also catalyze amine acylation between nucleotide analogs (R. K. Bruick et al. Chem. Biol. 1996, 3, 49-56).

However, although the ability of nucleic acid templates to accelerate the formation of a variety of non-natural nucleic acid analogues has been demonstrated, nearly all of these reactions previously shown to be catalyzed by nucleic acid templates were designed to proceed through transition states closely resembling the structure of the natural nucleic acid backbone (FIG. 2), typically affording products that preserve the same six-bond backbone spacing between nucleotide units. The motivation behind this design was presumably the assumption that the rate enhancement provided by nucleic acid templates depends on a precise alignment of reactive groups, and the precision of this alignment is maximized when the reactants and products mimic the structure of the DNA and RNA backbones. Evidence in support of the hypothesis that DNA-templated synthesis can only generate products that resemble the nucleic acid backbone comes from the well-known difficulty of macrocyclization in organic synthesis (G. Illuminati et al. Acc. Chem. Res. 1981, 14, 95-102; R. B. Woodward et al. J. Am. Chem. Soc. 1981, 103, 3210-3213). The rate enhancement of intramolecular ring closing reactions compared with their intermolecular counterparts is known to diminish quickly as rotatable bonds are added between reactive groups, such that linking reactants with a flexible 14-carbon linker hardly affords any rate acceleration (G. Illuminati et al. Acc. Chem. Res. 1981, 14, 95-102).

Because synthetic molecules of interest do not in general resemble nucleic acid backbones, the use of DNA-templated synthesis to translate DNA sequences into synthetic small molecules would be broadly useful only if synthetic molecules other than nucleic acids and nucleic acid analogs could be synthesized in a DNA-templated fashion. The ability of DNA-templated synthesis to translate DNA sequences into arbitrary non-natural small molecules therefore requires demonstrating that DNA-templated synthesis is a much more general phenomenon than has been previously described.

Significantly, for the first time it has been demonstrated herein that DNA-templated synthesis is indeed a general phenomenon and can be used for a variety of reactions and conditions to generate a diverse range of compounds, specifically including, for the first time, compounds that are not, or do not resemble, nucleic acids or analogs thereof. More specifically, the present invention extends the ability to amplify and evolve libraries of chemical compounds beyond natural biopolymers. The ability to synthesize chemical compounds of arbitrary structure allows researchers to write their own genetic codes incorporating a wide range of chemical functionality into novel backbone and side-chain structures, which enables the development of novel catalysts, drugs, and polymers, to name a few examples. For example, the ability to directly amplify and evolve these molecules by genetic selection enables the discovery of entirely new families of artificial catalysts which possess activity, bioavailability, solvent or thermal stability, or other physical properties (such as fluorescence, spin-labeling, or photolability) which are difficult or impossible to achieve using the limited set of natural protein and nucleic acid building blocks. Similarly, developing methods to amplify and directly evolve synthetic small molecules by iterated cycles of mutation and selection enables the isolation of novel ligands or drugs with properties superior to those isolated by traditional rational design or combinatorial screening drug discovery methods. Additionally, extending the approaches described herein to polymers of significance in material science would enable the evolution of new plastics.

In general, the method of the invention involves 1) providing one or more nucleic acid templates, which one or more nucleic acid templates optionally have a reactive unit associated therewith; and 2) contacting the one or more nucleic acid templates with one or more transfer units, each designed to have a first moiety, an anti-codon, which hybridizes to a sequence of the nucleic acid and is associated with a second moiety, a reactive unit, which includes specific functionality, a building block, reactant, etc. for the compound to be synthesized. It will be appreciated that in certain embodiments of the invention, the transfer unit comprises one moiety incorporating the hybridization capability of the anti-codon unit and the chemical functionality of the reaction unit. Once these transfer units have hybridized to the nucleic acid template in a sequence-specific manner, the synthesis of the chemical compound can take place due to the interaction of reactive units present on the transfer units and/or the nucleic acid template. Significantly, the sequence of the nucleic acid can later be determined to decode the synthetic history of the attached compound and thereby its structure. It will be appreciated that the method described herein may be used to synthesize one molecule at a time or may be used to synthesize thousands to millions of compounds using combinatorial methods.

It will be appreciated that a variety of chemical compounds can be prepared and evolved according to the method of the invention. In certain embodiments of the invention, however, the methods are utilized for the synthesis of chemical compounds that are not, or do not resemble, nucleic acids or nucleic acid analogs. For example, in certain embodiments of the invention, small molecule compounds can be synthesized by providing a template which has a reactive unit (e.g., building block or small molecule scaffold) associated therewith (attached directly or through a linker as described in more detail in Example 5 herein), and contacting the template simultaneously or sequentially with one or more transfer units having one or more reactive units associated therewith. In certain other embodiments, non-natural polymers can be synthesized by providing a template and contacting the template simultaneously with one or more transfer units having one or more reactive units associated therewith under conditions suitable to effect reaction of the adjacent reactive units on each of the transfer units (see, for example, FIG. 3, and Examples 5 and 9, as described in more detail herein).

Certain embodiments are discussed in more detail below; however, it will be appreciated that the present invention is not intended to be limited to those embodiments discussed below. Rather, the present invention is intended to encompass these embodiments and equivalents thereof.

Templates

As discussed above, one or more templates are utilized in the method of the invention and hybridize to the transfer units to direct the synthesis of the chemical compound. As would be appreciated by one of skill in this art, any template may be used in the methods and compositions of the present invention. Templates which can be mutated and thereby evolved can be used to guide the synthesis of another chemical compound or library of chemical compounds as described in the present invention. As described in more detail herein, the evolvable template encodes the synthesis of a chemical compound and can be used later to decode the synthetic history of the chemical compound, to indirectly amplify the chemical compound, and/or to evolve (i.e., diversify, select, and amplify) the chemical compound. The evolvable template is, in certain embodiments, a nucleic acid. In certain embodiments of the present invention, the template is based on a nucleic acid.

The nucleic acid templates used in the present invention are made of DNA, RNA, a hybrid of DNA and RNA, or a derivative of DNA and RNA, and may be single- or double-stranded. The sequence of the template is used in the inventive method to encode the synthesis of a chemical compound, preferably a compound that is not, or does not resemble, a nucleic acid or nucleic acid analog (e.g., an unnatural polymer or a small molecule). In the case of certain unnatural polymers, the nucleic acid template is used to align the monomer units in the sequence they will appear in the polymer and to bring them in close proximity with adjacent monomer units along the template so that they will react and become joined by a covalent bond. In the case of a small molecule, the template is used to bring particular reactants within proximity of the small molecule scaffold in order that they may modify the scaffold in a particular way. In certain other embodiments, the template can be utilized to generate non-natural polymers by PCR amplification of a synthetic DNA template library consisting of a random region of nucleotides, as described in Example 9 herein.

As would be appreciated by one of skill in the art, the sequence of the template may be designed in a number of ways without going beyond the scope of the present invention. For example, the length of the codon must be determined and the codon sequences must be set. If a codon length of two is used, then using the four naturally occurring bases only 16 possible combinations are available to be used in encoding the library. If the length of the codon is increased to three (the number Nature uses in encoding proteins), the number of possible combinations is increased to 64. Other factors to be considered in determining the length of the codon are mismatching, frame-shifting, complexity of library, etc. As the length of the codon is increased, up to a certain extent, the number of mismatches is decreased; however, excessively long codons will hybridize despite mismatched base pairs. In certain embodiments of special interest, the length of the codon ranges between 2 and 10 bases.
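The codon-count arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 4^n relationship stated in the text; the function name codon_count and its alphabet_size parameter are hypothetical and do not appear in the specification.

```python
def codon_count(codon_length: int, alphabet_size: int = 4) -> int:
    """Number of distinct codons of a given length.

    alphabet_size defaults to 4, the four naturally occurring bases
    assumed in the text above.
    """
    if codon_length < 1:
        raise ValueError("codon_length must be at least 1")
    return alphabet_size ** codon_length


# Codon lengths 2 and 3 reproduce the figures given in the text;
# 10 is the upper end of the range mentioned above.
for length in (2, 3, 10):
    print(length, codon_count(length))
# prints:
# 2 16
# 3 64
# 10 1048576
```

The count is an upper bound on the number of building blocks that can be encoded; any restriction placed on the codon set (for example, to suppress mismatching or frame-shifting) reduces the usable number.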

Another problem associated with using a nucleic acid template is frame-shifting. In Nature, the problem of frame-shifting in the translation of protein from an mRNA is avoided by use of the complex machinery of the ribosome. The inventive methods, however, will not take advantage of such complex machinery. Instead, frame-shifting may be remedied by lengthening each codon such that hybridization of a codon out of frame will guarantee a mismatch. For example, each codon may start with a G, and subsequent positions may be restricted to T, C, and A (FIG. 4). In another example, each codon may begin and end with a G, and the positions in between may be restricted to T, C, and A. Another way of avoiding frame-shifting is to have the codons sufficiently long so that the sequence of the codon is only found within the sequence of the template “in frame”. Spacer sequences may also be placed in between the codons to prevent frame-shifting.
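The first scheme described above (every codon starts with G, all other positions drawn from T, C, and A) can be checked computationally. The following Python sketch is an illustration under stated assumptions: a codon length of 5 is chosen arbitrarily, codons are concatenated without spacers, and the helper names g_first_codons and out_of_frame_hits are hypothetical. It confirms that no out-of-frame window of the template coincides with a valid codon, because every such window begins at a non-G position.

```python
import itertools
import random


def g_first_codons(length):
    """All codons that start with G and use only T, C, A elsewhere."""
    return {
        "G" + "".join(rest)
        for rest in itertools.product("TCA", repeat=length - 1)
    }


def out_of_frame_hits(template, codons, length):
    """Start positions of out-of-frame windows that equal a valid codon."""
    hits = []
    for start in range(len(template) - length + 1):
        in_frame = start % length == 0
        if not in_frame and template[start:start + length] in codons:
            hits.append(start)
    return hits


random.seed(0)
CODON_LENGTH = 5  # assumed for illustration only
codons = g_first_codons(CODON_LENGTH)

# Build a template of 20 randomly chosen codons, with no spacers.
template = "".join(random.choice(sorted(codons)) for _ in range(20))

print(len(codons))                                        # 81
print(out_of_frame_hits(template, codons, CODON_LENGTH))  # []
```

The restriction costs coding capacity: with length 5, the scheme leaves 3^4 = 81 codons rather than 4^5 = 1024.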

It will be appreciated that the template can vary greatly in the number of bases. For example, in certain embodiments, the template may be 10 to 10,000 bases long, preferably between 10 and 1,000 bases long. The length of the template will of course depend on the length of the codons, complexity of the library, length of the unnatural polymer to be synthesized, complexity of the small molecule to be synthesized, use of spacer sequences, etc. The nucleic acid sequence may be prepared using any method known in the art to prepare nucleic acid sequences. These methods include both in vivo and in vitro methods including PCR, plasmid preparation, endonuclease digestion, solid phase synthesis, in vitro transcription, strand separation, etc. In certain embodiments, the nucleic acid template is synthesized using an automated DNA synthesizer.

As discussed above, in certain embodiments of the invention, the method is used to synthesize chemical compounds that are not, or do not resemble, nucleic acids or nucleic acid analogs. Although it has been demonstrated that DNA-templated synthesis can be utilized to direct the synthesis of nucleic acids and analogs thereof, it has not been previously demonstrated that the phenomenon of DNA-templated synthesis is general enough to extend to other more complex chemical compounds (e.g., small molecules, non-natural polymers). As described in detail herein, it has been demonstrated that DNA-templated synthesis is indeed a more general phenomenon and that a variety of reactions can be utilized.

Thus, in certain embodiments of the present invention, the nucleic acid template comprises sequences of bases that encode the synthesis of an unnatural polymer or small molecule. The message encoded in the nucleic acid template preferably begins with a specific codon that brings into place a chemically reactive site from which the polymerization can take place, or in the case of synthesizing a small molecule the “start” codon may encode for an anti-codon associated with a small molecule scaffold or a first reactant. The “start” codon of the present invention is analogous to the “start” codon, ATG, found in Nature, which encodes for the amino acid methionine. To give but one example for use in synthesizing an unnatural polymer library, the start codon may encode for a start monomer unit comprising a primary amine masked by a photolabile protecting group, as shown below in Example 5A.

In yet other embodiments of the invention, the nucleic acid template itself may be modified to include an initiation site for polymer synthesis (e.g., a nucleophile) or a small molecule scaffold. In certain embodiments, the nucleic acid template includes a hairpin loop on one of its ends terminating in a reactive group used to initiate polymerization of the monomer units. For example, a DNA template may comprise a hairpin loop terminating in a 5′-amino group, which may be protected or not. From the amino group polymerization of the unnatural polymer may commence. The reactive amino group can also be used to link a small molecule scaffold onto the nucleic acid template in order to synthesize a small molecule library.

To terminate the synthesis of the unnatural polymer, a “stop” codon should be included in the nucleic acid template, preferably at the end of the encoding sequence. The “stop” codon of the present invention is analogous to the “stop” codons (i.e., UAA, UAG, UGA; TAA, TAG, TGA in the corresponding DNA) found in mRNA transcripts. In Nature, these codons lead to the termination of protein synthesis. In certain embodiments, a “stop” codon is chosen that is compatible with the artificial genetic code used to encode the unnatural polymer. For example, the “stop” codon should not conflict with any other codons used to encode the synthesis, and it should be of the same general format as the other codons used in the template. The “stop” codon may encode for a monomer unit that terminates polymerization by not providing a reactive group for further attachment. For example, a stop monomer unit may contain a blocked reactive group such as an acetamide rather than a primary amine as shown in Example 5A below. In yet other embodiments, the stop monomer unit comprises a biotinylated terminus providing a convenient way of terminating the polymerization step and purifying the resulting polymer.

Transfer Units

As described above, in the method of the invention, transfer units are also provided which comprise an anti-codon and a reactive unit. It will be appreciated that the anti-codons used in the present invention are designed to be complementary to the codons present within the nucleic acid template, and should be designed with the nucleic acid template and the codons used therein in mind. For example, the sequences used in the template as well as the length of the codons would need to be taken into account in designing the anti-codons. Any molecule which is complementary to a codon used in the template may be used in the inventive methods (e.g., nucleotides or non-natural nucleotides). In certain embodiments, the codons comprise one or more bases found in Nature (i.e., thymine, uracil, guanine, cytosine, adenine). In certain other embodiments, the anti-codon comprises one or more nucleotides normally found in Nature with a base, a sugar, and an optional phosphate group. In yet other embodiments, the bases are strung out along a backbone that is not the sugar-phosphate backbone normally found in Nature (e.g., non-natural nucleotides).

As discussed above, the anti-codon is associated with a particular type of reactive unit to form a transfer unit. It will be appreciated that this reactive unit may represent a distinct entity or may be part of the functionality of the anti-codon unit (see, Example 9). In certain embodiments, each anti-codon sequence is associated with one monomer type. For example, the anti-codon sequence ATTAG may be associated with a carbamate residue with an iso-butyl side chain, and the anti-codon sequence CATAG may be associated with a carbamate residue with a phenyl side chain. This one-for-one mapping of anti-codon to monomer units allows one to decode any polymer of the library by sequencing the nucleic acid template used in the synthesis and allows one to synthesize the same polymer or a related polymer by knowing the sequence of the original polymer. It will be appreciated by one of skill in this art that by changing (e.g., mutating) the sequence of the template, different monomer units will be brought into place, thereby allowing the synthesis of related polymers, which can subsequently be selected and evolved. In certain preferred embodiments, several anti-codons may code for one monomer unit as is the case in Nature.
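The one-for-one mapping described above lends itself to a simple lookup table. The following Python sketch decodes a template into its monomer sequence using the two anti-codons given in the text (ATTAG and CATAG). Several things here are assumptions made for illustration and are not taken from the specification: sequences are written 5′ to 3′, anti-codons pair with codons in the usual antiparallel Watson-Crick fashion (so each codon is the reverse complement of its anti-codon), codons are contiguous with no spacers or start/stop codons, and the names reverse_complement, ANTICODON_TO_MONOMER, and decode are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Anti-codon to monomer mapping taken from the example in the text.
ANTICODON_TO_MONOMER = {
    "ATTAG": "carbamate (iso-butyl side chain)",
    "CATAG": "carbamate (phenyl side chain)",
}

# Assumed: the codon on the template is the reverse complement
# of the anti-codon carried by the transfer unit.
CODON_TO_MONOMER = {
    reverse_complement(anticodon): monomer
    for anticodon, monomer in ANTICODON_TO_MONOMER.items()
}


def decode(template, codon_length=5):
    """Read the template codon by codon and list the encoded monomers."""
    if len(template) % codon_length != 0:
        raise ValueError("template length is not a multiple of codon length")
    monomers = []
    for i in range(0, len(template), codon_length):
        codon = template[i:i + codon_length]
        if codon not in CODON_TO_MONOMER:
            raise KeyError(f"no transfer unit for codon {codon!r} at {i}")
        monomers.append(CODON_TO_MONOMER[codon])
    return monomers


print(decode("CTAATCTATGCTAAT"))
```

Supporting several anti-codons per monomer, as mentioned above, only requires adding more keys that map to the same monomer; the decoding direction stays unambiguous.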

In certain other embodiments of the present invention where a small molecule library is to be created rather than a polymer library, the anti-codon is associated with a reactant used to modify the small molecule scaffold. In certain embodiments, the reactant is associated with the anti-codon through a linker long enough to allow the reactant to come in contact with the small molecule scaffold. The linker should preferably be of such a length and composition to allow for intramolecular reactions and minimize intermolecular reactions. The reactants include a variety of reagents as demonstrated by the wide range of reactions that can be utilized in DNA-templated synthesis (see Examples 2, 3, and 4 herein) and can be any chemical group, catalyst (e.g., organometallic compounds), or reactive moiety (e.g., electrophiles, nucleophiles) known in the chemical arts.

Additionally, the association between the anti-codon and the monomer unit or reactant in the transfer unit may be covalent or non-covalent. In certain embodiments of special interest, the association is through a covalent bond, and in certain embodiments the covalent linkage is severable. The linkage may be cleaved by light, oxidation, hydrolysis, exposure to acid, exposure to base, reduction, etc. For examples of linkages used in this art, please see Fruchtel et al. Angew. Chem. Int. Ed. Engl. 35:17, 1996, incorporated herein by reference. The anti-codon and the monomer unit or reactant may also be associated through non-covalent interactions such as ionic, electrostatic, hydrogen bonding, van der Waals interactions, hydrophobic interactions, pi-stacking, etc. and combinations thereof. To give but one example, the anti-codon may be linked to biotin, and the monomer unit linked to streptavidin. The propensity of streptavidin to bind biotin leads to the non-covalent association between the anti-codon and the monomer unit to form the transfer unit.

Synthesis of Certain Exemplary Compounds

It will be appreciated that a variety of compounds and/or libraries can be prepared using the method of the invention. As discussed above, in certain embodiments of special interest, compounds that are not, or do not resemble, nucleic acids or analogs thereof, are synthesized according to the method of the invention.

In certain embodiments, polymers, specifically unnatural polymers, are prepared according to the method of the present invention. The unnatural polymers that can be created using the inventive method and system include any unnatural polymers. Exemplary unnatural polymers include, but are not limited to, polycarbamates, polyureas, polyesters, polyacrylates, polyalkylenes (e.g., polyethylene, polypropylene), polycarbonates, polypeptides with unnatural stereochemistry, polypeptides with unnatural amino acids, and combinations thereof. In certain embodiments, the polymers comprise at least 10 monomer units. In certain other embodiments, the polymers comprise at least 50 monomer units. In yet other embodiments, the polymers comprise at least 100 monomer units. The polymers synthesized using the inventive system may be used as catalysts, pharmaceuticals, metal chelators, materials, etc.

In preparing certain unnatural polymers, the monomer units attached to the anti-codons and used in the present invention may be any monomers or oligomers capable of being joined together to form a polymer. The monomer units may be carbamates, D-amino acids, unnatural amino acids, ureas, hydroxy acids, esters, carbonates, acrylates, ethers, etc. In certain embodiments, the monomer units have two reactive groups used to link the monomer unit into the growing polymer chain. Preferably, the two reactive groups are not the same so that the monomer unit may be incorporated into the polymer in a directional sense; for example, one end may be an electrophile and the other end a nucleophile. Reactive groups may include, but are not limited to, esters, amides, carboxylic acids, activated carbonyl groups, acid chlorides, amines, hydroxyl groups, thiols, etc. In certain embodiments, the reactive groups are masked or protected (Greene & Wuts Protective Groups in Organic Synthesis, 3rd Edition Wiley, 1999; incorporated herein by reference) so that polymerization may not take place until a desired time when the reactive groups are deprotected. Once the monomer units are assembled along the nucleic acid template, initiation of the polymerization sequence results in a cascade of polymerization and deprotection steps wherein the polymerization step results in deprotection of a reactive group to be used in the subsequent polymerization step (see, FIG. 3).

The monomer units to be polymerized may comprise two or more units depending on the geometry along the nucleic acid template. As would be appreciated by one of skill in this art, the monomer units to be polymerized must be able to stretch along the nucleic acid template and particularly across the distance spanned by its encoding anti-codon and optional spacer sequence. In certain embodiments, the monomer unit actually comprises two monomers, for example, a dicarbamate, a diurea, a dipeptide, etc. In yet other embodiments, the monomer unit actually comprises three or more monomers.

The monomer units may contain any chemical groups known in the art. As would be appreciated by one of skill in this art, reactive chemical groups, especially those that would interfere with polymerization, hybridization, etc., are masked using known protecting groups (Greene & Wuts Protective Groups in Organic Synthesis, 3rd Edition Wiley, 1999; incorporated herein by reference). In general, the protecting groups used to mask these reactive groups are orthogonal to those used in protecting the groups used in the polymerization steps.

In synthesizing an unnatural polymer, in certain embodiments, a template is provided encoding the sequence of monomer units. Transfer units are then allowed to contact the template under conditions that allow for hybridization of the anti-codons to the template. Polymerization of the monomer units along the template is then allowed to occur to form the unnatural polymer. The newly synthesized polymer may then be cleaved from the anti-codons and/or the template. The template may be used as a tag to elucidate the structure of the polymer or may be used to amplify and evolve the unnatural polymer. As will be described in more detail below, the present method may be used to prepare a library of unnatural polymers. For example, in certain embodiments, as described in more detail in Example 9 herein, a library of DNA templates can be utilized to prepare unnatural polymers. In general, the method takes advantage of the fact that certain DNA polymerases are able to accept certain modified nucleotide triphosphate substrates and that several deoxyribonucleotides and ribonucleotides bearing modified groups that do not participate in Watson-Crick bonding are known to be inserted with high sequence specificity opposite natural DNA templates. Accordingly, single stranded DNA containing modified nucleotides can serve as efficient templates for the DNA polymerase-catalyzed incorporation of natural or modified nucleotides.

It will be appreciated that the inventive methods may also be used to synthesize other classes of chemical compounds besides unnatural polymers. For example, small molecules may be prepared using the methods and compositions provided by the present invention. These small molecules may be natural product-like, non-polymeric, and/or non-oligomeric. The substantial interest in small molecules is due in part to their use as the active ingredient in many pharmaceutical preparations although they may also be used as catalysts, materials, additives, etc.

In synthesizing small molecules using the method of the present invention, an evolvable template is also provided. The template may either comprise a small molecule scaffold upon which the small molecule is to be built, or a small molecule scaffold may be added to the template. The small molecule scaffold may be any chemical compound with sites for functionalization. For example, the small molecule scaffold may comprise a ring system (e.g., the ABCD steroid ring system found in cholesterol) with functionalizable groups off the atoms making up the rings. In another example, the small molecule may be the underlying structure of a pharmaceutical agent such as morphine or a cephalosporin antibiotic (see Examples 5C and 5D below). The sites or groups to be functionalized on the small molecule scaffold may be protected using methods and protecting groups known in the art. The protecting groups used in a small molecule scaffold may be orthogonal to one another so that protecting groups can be removed one at a time.

In this embodiment, the transfer units comprise an anti-codon similar to those described in the unnatural polymer synthesis; however, these anti-codons are associated with reactants or building blocks to be used in modifying, adding to, or taking away from the small molecule scaffold. The reactants or building blocks may be electrophiles (e.g., acetyl, amides, acid chlorides, esters, nitriles, imines), nucleophiles (e.g., amines, hydroxyl groups, thiols), catalysts (e.g., organometallic catalysts), side chains, etc. See, for example, reactions in aqueous and organic media as described herein in Examples 2 and 4. The transfer units are allowed to contact the template under hybridizing conditions, and the attached reactant or building block is allowed to react with a site on the small molecule scaffold. In certain embodiments, protecting groups on the small molecule template are removed one at a time from the sites to be functionalized so that the reactant of the transfer unit will react at only the desired position on the scaffold. As will be appreciated by one of skill in the art, the anti-codon may be associated with the reactant through a linker moiety (see, Example 3). The linker facilitates contact of the reactant with the small molecule scaffold and in certain embodiments, depending on the desired reaction, positions DNA as a leaving group (“autocleavable” strategy), or may link reactive groups to the template via the “scarless” linker strategy (which yields product without leaving behind additional chemical functionality), or a “useful scar” strategy (in which the linker is left behind and can be functionalized in subsequent steps following linker cleavage). The reaction conditions, linker, reactant, and site to be functionalized are chosen to avoid intermolecular reactions and accelerate intramolecular reactions.
It will also be appreciated that the method of the present invention contemplates both sequential and simultaneous contacting of the template with transfer units depending on the particular compound to be synthesized. In certain embodiments of special interest, the multi-step synthesis of chemical compounds is provided in which the template is contacted sequentially with two or more transfer units to facilitate multi-step synthesis of complex chemical compounds.

After the sites on the scaffold have been modified, the newly synthesized small molecule is linked to the template that encoded its synthesis. Decoding of the template tag will allow one to elucidate the synthetic history and thereby the structure of the small molecule. The template may also be amplified in order to create more of the desired small molecule and/or the template may be evolved to create related small molecules. The small molecule may also be cleaved from the template for purification or screening.

As would be appreciated by one of skill in this art, a plurality of templates may be used to encode the synthesis of a combinatorial library of small molecules using the method described above. This would allow for the amplification and evolution of a small molecule library, a feat which has not been accomplished before the present invention.

Method of Synthesizing Libraries of Compounds

In the inventive method, a nucleic acid template, as described above, is provided to direct the synthesis of an unnatural polymer, a small molecule, or any other type of molecule of interest. In general, a plurality of nucleic acid templates is provided wherein the number of different sequences provided ranges from 2 to 10¹⁵. In one embodiment of the present invention, a plurality of nucleic acid templates is provided, preferably at least 100 different nucleic acid templates, more preferably at least 10,000 different nucleic acid templates, and most preferably at least 1,000,000 different nucleic acid templates. Each template provided comprises a unique nucleic acid sequence used to encode the synthesis of a particular unnatural polymer or small molecule. As described above, the template may also have functionality such as a primary amine from which the polymerization is initiated or a small molecule scaffold. In certain embodiments, the nucleic acid templates are provided in one “pot”. In certain other embodiments, the templates are provided in aqueous media, and subsequent reactions are performed in aqueous media.

To the template are added transfer units with anti-codons, as described above, associated with a monomer unit, as described above. In certain embodiments, a plurality of transfer units is provided so that there is an anti-codon for every codon represented in the template. In a preferred embodiment, certain anti-codons are used as start and stop sites. In general, a large enough number of transfer units is provided so that all corresponding codon sites on the template are filled after hybridization.

The anti-codons of the transfer units are allowed to hybridize to the nucleic acid template thereby bringing the monomer units together in a specific sequence as determined by the template. In the situation where a small molecule library is being synthesized, reactants are brought in proximity to a small molecule scaffold. The hybridization conditions, as would be appreciated by those of skill in the art, should preferably allow for only perfect matching between the codon and its anti-codon. Even single base pair mismatches should be avoided. Hybridization conditions may include, but are not limited to, temperature, salt concentration, pH, concentration of template, concentration of anti-codons, and solvent. The hybridization conditions used in synthesizing the library may depend on the length of the codon/anti-codon, the similarity between the codons present in the templates, the content of G/C versus A/T base pairs, etc. (for further information regarding hybridization conditions, please see, Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch, and Maniatis (Cold Spring Harbor Laboratory Press: 1989); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); the treatise, Methods in Enzymology (Academic Press, Inc., N.Y.); Ausubel et al. Current Protocols in Molecular Biology (John Wiley & Sons, Inc., New York, 1999); each of which is incorporated herein by reference).
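Two of the sequence-dependent factors named above, the similarity between codons and their G/C content, can be tabulated for a candidate codon set before any conditions are chosen. The Python sketch below is only an illustration: the three codons are arbitrary examples (the first two are the reverse complements of the anti-codons ATTAG and CATAG mentioned elsewhere in this description, the third is invented), the helper names are hypothetical, and the numbers it reports do not by themselves predict hybridization behavior.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    if len(a) != len(b):
        raise ValueError("codons must have the same length")
    return sum(x != y for x, y in zip(a, b))


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def min_pairwise_distance(codons):
    """Smallest Hamming distance between any two codons in the set."""
    return min(hamming(a, b) for a, b in combinations(codons, 2))


codons = ["CTAAT", "CTATG", "GACCA"]  # illustrative codon set

for codon in codons:
    print(codon, gc_fraction(codon))

# A small minimum distance flags codon pairs that a near-matching
# anti-codon could confuse under permissive conditions.
print(min_pairwise_distance(codons))  # 2
```

A codon set with a larger minimum pairwise distance and a narrow spread of G/C fractions would be expected to tolerate a wider range of conditions, although the actual conditions would still need to be determined experimentally.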

After hybridization of the anti-codons to the codons on the template has occurred, the monomer units are then polymerized in the case of the synthesis of unnatural polymers. The polymerization of the monomer units may occur spontaneously or may need to be initiated, for example, by the deprotection of a reactive group such as a nucleophile or by providing light of a certain wavelength. In certain other embodiments, polymerization can be catalyzed by DNA polymerases capable of effecting polymerization of non-natural nucleotides (see, Example 9). The polymerization preferably occurs in one direction along the template with adjacent monomer units becoming joined through a covalent linkage. The termination of the polymerization step occurs by the addition of a monomer unit that is not capable of being added onto. In the case of the synthesis of small molecules, the reactants are allowed to react with the small molecule scaffold. The reactant may react spontaneously, or protecting groups on the reactant and/or the small molecule scaffold may need to be removed. Other reagents (e.g., acid, base, catalyst, hydrogen gas, etc.) may also be needed to effect the reaction (see, Examples 5A-5E).

After the unnatural polymers or small molecules have been created with the aid of the nucleic acid template, they may be cleaved from the nucleic acid template and/or anti-codons used to synthesize them. In certain embodiments, the polymers or small molecules are assayed before being completely detached from the nucleic acid templates that encode them. Once the polymer or small molecule is selected, the sequence of the template or its complement may be determined to elucidate the structure of the attached polymer or small molecule. This sequence may then be amplified and/or evolved to create new libraries of related polymers or small molecules that in turn may be screened and evolved.

Uses

The methods and compositions of the present invention represent a new way to generate molecules with desired properties. This approach marries the extremely powerful genetic methods, which molecular biologists have taken advantage of for decades, with the flexibility and power of organic chemistry. The ability to prepare, amplify, and evolve unnatural polymers by genetic selection may lead to new classes of catalysts that possess activity, bioavailability, stability, fluorescence, photolability, or other properties that are difficult or impossible to achieve using the limited set of building blocks found in proteins and nucleic acids. Similarly, developing new systems for preparing, amplifying, and evolving small molecules by iterated cycles of mutation and selection may lead to the isolation of novel ligands or drugs with properties superior to those isolated by slower traditional drug discovery methods (see, Example 7).

Performing organic library synthesis on the molecular biology scale is a fundamentally different approach from traditional solid phase library synthesis and carries significant advantages. A library created using the inventive methods can be screened using any method known in this art (e.g., binding assay, catalytic assay). For example, selection based on binding to a target molecule can be carried out on the entire library by passing the library over a resin covalently linked to the target. Those biopolymers that have affinity for the resin-bound target can be eluted with free target molecules, and the selected compounds can be amplified using the methods described above. Subsequent rounds of selection and amplification can result in a pool of compounds enriched with sequences that bind the target molecule. In certain embodiments, the target molecule mimics a transition state of a chemical reaction, and the chemical compounds selected may serve as a catalyst for the chemical reaction. Because the information encoding the synthesis of each molecule is covalently attached to the molecule at one end, an entire library can be screened at once and yet each molecule is selected on an individual basis.

Such a library can also be evolved by introducing mutations at the DNA level using error-prone PCR (Cadwell et al. PCR Methods Appl. 2:28, 1992; incorporated herein by reference) or by subjecting the DNA to in vitro homologous recombination (Stemmer Proc. Natl. Acad. Sci. USA 91:10747, 1994; Stemmer Nature 370:389, 1994; each of which is incorporated herein by reference). Repeated cycles of selection, amplification, and mutation may afford biopolymers with greatly increased binding affinity for target molecules or with significantly improved catalytic properties. The final pool of evolved biopolymers having the desired properties can be sequenced by sequencing the nucleic acid cleaved from the polymers. The nucleic acid-free polymers can be purified using any method known in the art including HPLC, column chromatography, FPLC, etc., and their binding or catalytic properties can be verified in the absence of covalently attached nucleic acid.
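The diversification step can be pictured with a toy model of point mutation at the DNA level. The sketch below is only an in silico illustration under stated assumptions: each base is substituted independently with a fixed probability, and all three alternative bases are equally likely. Real error-prone PCR has biased error spectra, and the function names and the mutation-rate parameter are hypothetical, not taken from the cited protocols.

```python
import random

BASES = "ACGT"


def mutate(template: str, rate: float, rng: random.Random) -> str:
    """Return a copy of `template` with random point substitutions.

    Each position is replaced, with probability `rate`, by one of the
    three other bases chosen uniformly (a simplifying assumption).
    """
    out = []
    for base in template:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    parent = "TGACGGGT"
    pool = [mutate(parent, 0.1, rng) for _ in range(5)]
    for child in pool:
        print(child, hamming(parent, child))
```

Each mutated template would then be translated, selected, and amplified again, which is the iterated cycle the paragraph above describes.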

The polymerization of synthetically-generated monomer units independent of the ribosomal machinery allows the incorporation of an enormous variety of side chains with novel chemical, biophysical, or biological properties. Terminating each biopolymer with a biotin side chain, for example, allows the facile purification of only full-length biopolymers which have been completely translated by passing the library through an avidin-linked resin. Biotin-terminated biopolymers can be selected for the actual catalysis of bond-breaking reactions by passing these biopolymers over resin linked through the substrate to avidin (FIG. 5). Those biopolymers that catalyze substrate cleavage would self-elute from a column charged with this resin. Similarly, biotin-terminated biopolymers can be selected for the catalysis of bond-forming reactions (FIG. 5). One substrate is linked to resin and the second substrate is linked to avidin. Biopolymers that catalyze bond formation between the substrates are selected by their ability to ligate the substrates together, resulting in attachment of the biopolymer to the resin. Novel side chains can also be used to introduce cofactors into the biopolymers. A side chain containing a metal chelator, for example, may provide biopolymers with metal-mediated catalytic properties, while a flavin-containing side chain may equip biopolymers with the potential to catalyze redox reactions.

In this manner unnatural biopolymers may be isolated which serve as artificial receptors to selectively bind molecules or which catalyze chemical reactions. Characterization of these molecules would provide important insight into the ability of polycarbamates, polyureas, polyesters, polycarbonates, polypeptides with unnatural side chains and stereochemistries, or other unnatural polymers to form secondary or tertiary structures with binding or catalytic properties.

Kits

The present invention also provides kits and compositions for use in the inventive methods. The kits may contain any item or composition useful in practicing the present invention. The kits may include, but are not limited to, templates, anticodons, transfer units, monomer units, building blocks, reactants, small molecule scaffolds, buffers, solvents, enzymes (e.g., heat stable polymerase, reverse transcriptase, ligase, restriction endonuclease, exonuclease, Klenow fragment, polymerase, alkaline phosphatase, polynucleotide kinase), linkers, protecting groups, polynucleotides, nucleosides, nucleotides, salts, acids, bases, solid supports, or any combinations thereof.

As would be appreciated by one of skill in this art, a kit for preparing unnatural polymers would contain items needed to prepare unnatural polymers using the inventive methods described herein. Such a kit may include templates, anti-codons, transfer units, monomer units, or combinations thereof. A kit for synthesizing small molecules may include templates, anti-codons, transfer units, building blocks, small molecule scaffolds, or combinations thereof.

The inventive kit may also be equipped with items needed to amplify and/or evolve a polynucleotide template such as a heat stable polymerase for PCR, nucleotides, buffer, and primers. In certain other embodiments, the inventive kit includes items commonly used in performing DNA shuffling such as polynucleotides, ligase, and nucleotides.

In addition to the templates and transfer units described herein, the present invention also includes compositions comprising complex small molecules, scaffolds, or unnatural polymers prepared by any one or more of the methods of the invention as described herein.

EQUIVALENTS

The representative examples that follow are intended to help illustrate the invention, and are not intended to, nor should they be construed to, limit the scope of the invention. Indeed, various modifications of the invention and many further embodiments thereof, in addition to those shown and described herein, will become apparent to those skilled in the art from the full contents of this document, including the examples which follow and the references to the scientific and patent literature cited herein. It should further be appreciated that the contents of those cited references are incorporated herein by reference to help illustrate the state of the art.

The following examples contain important additional information, exemplification and guidance that can be adapted to the practice of this invention in its various embodiments and the equivalents thereof.

EXEMPLIFICATION

Example 1

The Generality of DNA-Templated Synthesis: Clearly, implementing the small molecule evolution approach described above requires establishing the generality of DNA-templated synthesis. The present invention, for the first time, establishes the generality for this approach and thus enables the synthesis of a variety of chemical compounds using DNA-templated synthesis. As shown in FIG. 6a, the ability of two DNA architectures to support solution-phase DNA-templated synthesis was established. Both hairpin (H) and end-of-helix (E) templates bearing electrophilic maleimide groups reacted efficiently with one equivalent of thiol reagent linked to a complementary DNA oligonucleotide to yield the thioether product in minutes at 25° C. DNA-templated reaction rates (k_app ≈ 10⁵ M⁻¹s⁻¹) were similar for H and E architectures despite significant differences in the relative orientation of their reactive groups. In contrast, no product was observed when using reagents containing sequence mismatches, or when using templates pre-quenched with excess β-mercaptoethanol (FIG. 6a). Both templates therefore support the sequence-specific DNA-templated addition of a thiol to a maleimide even though the structures of the resulting products differ markedly from the structure of the natural DNA backbone. Little or no non-templated intermolecular reaction products are observed under the reaction conditions (pH 7.5, 25° C., 250 mM NaCl, 60 nM template and reagent).

Additionally, sequence-specific DNA-templated reactions spanning a variety of reaction types (S_N2 substitutions, additions to α,β-unsaturated carbonyl systems, and additions to vinyl sulfones), nucleophiles (thiols and amines), and reactant structures all proceeded in good yields with excellent sequence selectivity (FIG. 6b). Expected product masses were verified by mass spectrometry. In each case, matched but not mismatched reagents afforded product efficiently despite considerable variations in their transition state geometry, steric hindrance, and conformational flexibility. Collectively these findings indicate that DNA-templated synthesis is a general phenomenon capable of supporting a range of reaction types, and is not limited to the creation of structures resembling nucleic acid backbones as described previously.

Since sequence discrimination is important for the faithful translation of DNA into synthetic structures, the reaction rate of a matched reagent compared with that of a reagent bearing a single mismatched base near the center of its 10-base oligonucleotide was measured. At 25° C., the initial rate of reaction of matched thiol reagents with iodoacetamide-linked H templates is 200-fold faster than that of reagents bearing a single mismatch (k_app = 2.4×10⁴ M⁻¹s⁻¹ vs. 1.1×10² M⁻¹s⁻¹, FIG. 7). In addition, small amounts of products arising from the annealing of mismatched reagents can be eliminated by elevating the reaction temperature beyond the T_m of the mismatched reagents (FIG. 7). The decrease in the rate of product formation as temperature is elevated further indicates that product formation proceeds by a DNA-templated mechanism rather than a simple intermolecular mechanism.
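The stated selectivity follows directly from the two apparent rate constants. The short sketch below simply takes their ratio; the variable and function names are introduced here for illustration, and only the two k_app values come from the text.

```python
def fold_selectivity(k_matched: float, k_mismatched: float) -> float:
    """Ratio of matched to mismatched apparent second-order rate constants."""
    return k_matched / k_mismatched


# Apparent rate constants reported above, in M^-1 s^-1
K_MATCHED = 2.4e4
K_SINGLE_MISMATCH = 1.1e2

if __name__ == "__main__":
    ratio = fold_selectivity(K_MATCHED, K_SINGLE_MISMATCH)
    print(f"matched/mismatched = {ratio:.0f}-fold")
```

The ratio comes out slightly above 200, consistent with the rounded "200-fold" figure quoted in the text.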

In addition to reaction generality and sequence specificity, DNA-templated synthesis also demonstrates remarkable distance independence. Both H and E templates linked to maleimide or α-iodoacetamide groups promote sequence-specific reaction with matched, but not mismatched, thiol reagents annealed anywhere on the templates examined thus far (up to 30 bases away from the reactive group on the template). Reactants annealed one base away react at rates similar to those annealed 2, 3, 4, 6, 8, 10, 15, 20, or 30 bases away (FIG. 8). In all cases, templated reaction rates are several hundred-fold higher than the rate of untemplated (mismatched) reaction (k_app = 10⁴-10⁵ M⁻¹s⁻¹ vs. 5×10¹ M⁻¹s⁻¹). At intervening distances of 30 bases, products are efficiently formed presumably through transition states resembling 200-membered rings. These findings contrast sharply with the well-known difficulty of macrocyclization (see, for example, G. Illuminati et al. Acc. Chem. Res. 1981, 14, 95-102; R. B. Woodward et al. J. Am. Chem. Soc. 1981, 103, 3210-3213) in organic synthesis.

To determine the basis of the distance independence of DNA-templated synthesis, a series of modified E templates was first synthesized in which the intervening bases were replaced by a series of DNA analogs designed to evaluate the possible contribution of (i) interbase interactions, (ii) conformational preferences of the DNA backbone, (iii) the charged phosphate backbone, and (iv) backbone hydrophilicity. Replacing the intervening bases with any of the analogs in FIG. 9 had little effect on the rates of product formation. These findings indicate that backbone structural elements specific to DNA are not responsible for the observed distance independence of DNA-templated synthesis. However, the addition of a 10-base DNA oligonucleotide “clamp” complementary to the single-stranded intervening region significantly reduced product formation (FIG. 9), suggesting that the flexibility of this region is critical to efficient DNA-templated synthesis.

The distance independent reaction rates may be explained if the bond-forming events in a DNA-templated format are sufficiently accelerated relative to their nontemplated counterparts such that DNA annealing, rather than bond formation, is rate-determining. If DNA annealing is at least partially rate limiting, then the rate of product formation should decrease as the concentration of reagents is lowered because annealing, unlike templated bond formation, is a bimolecular process. Decreasing the concentration of reactants in the case of the E template with one or ten intervening bases between reactive groups resulted in a marked decrease in the observed reaction rate (FIG. 10). This observation suggests that proximity effects in DNA-templated synthesis can enhance bond formation rates to the point that DNA annealing becomes rate-determining.
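The concentration dependence argued above can be made concrete with a minimal second-order kinetic sketch. It assumes, for illustration only, that annealing is fully rate-determining and irreversible and that template and reagent are equimolar, so that the rate is k·C² and the half-life is 1/(k·C₀). The function names are hypothetical; the example values (k_app of about 10⁵ M⁻¹s⁻¹, 60 nM reactants) are those quoted earlier in this example.

```python
def initial_rate(k_app: float, conc: float) -> float:
    """Initial rate (M/s) of a bimolecular step with equimolar reactants."""
    return k_app * conc * conc


def half_life(k_app: float, conc: float) -> float:
    """Half-life (s) of an irreversible second-order reaction, A0 = B0 = conc."""
    return 1.0 / (k_app * conc)


def fraction_converted(k_app: float, conc: float, t: float) -> float:
    """Fraction of reactant converted after t seconds under the same model."""
    x = k_app * conc * t
    return x / (1.0 + x)


if __name__ == "__main__":
    k = 1e5      # M^-1 s^-1, order of magnitude reported for templated reactions
    c0 = 60e-9   # M, template and reagent concentration used in the examples
    print(f"half-life at 60 nM: {half_life(k, c0):.0f} s")
    print(f"rate drop on 2-fold dilution: "
          f"{initial_rate(k, c0) / initial_rate(k, c0 / 2):.1f}-fold")
```

Under these assumptions the half-life at 60 nM is a few minutes, in line with product forming "in minutes", and halving the concentration lowers the initial rate about four-fold, whereas a purely unimolecular templated step would not show such a dependence on dilution.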

These findings raise the possibility of using DNA-templated synthesis to translate in one pot libraries of DNA into solution-phase libraries of synthetic molecules suitable for PCR amplification and selection. The ability of DNA-templated synthesis to support a variety of transition state geometries suggests its potential in directing a range of powerful water-compatible synthetic reactions (see, Li, C. J. Organic Reactions in Aqueous Media, Wiley and Sons, New York: 1997). The sequence specificity described above suggests that mixtures of reagents may be able to react predictably with complementary mixtures of templates. Finally, the observed distance independence suggests that different regions of DNA “codons” may be used to encode different groups on the same synthetic scaffold without impairing reaction rates. As a demonstration of this approach, a library of 1,025 maleimide-linked templates was synthesized, each with a different DNA sequence in an eight-base encoding region (FIG. 11). One of these sequences, 5′-TGACGGGT-3′, was arbitrarily chosen to code for the attachment of a biotin group to the template. A library of thiol reagents linked to 1,025 different oligonucleotides was also generated. The reagent linked to 3′-ACTGCCCA-5′ contained a biotin group, while the other 1,024 reagents contained no biotin. Equimolar ratios of all 1,025 templates and 1,025 reagents were mixed in one pot for 10 minutes at 25° C. and the resulting products were selected in vitro for binding to streptavidin. Molecules surviving the selection were amplified by PCR and analyzed by restriction digestion and DNA sequencing.
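The pairing between the biotin-encoding codon and its reagent anti-codon can be checked mechanically. The sketch below uses standard Watson-Crick pairing; the helper names are introduced here for illustration. Reading the complement in the same left-to-right order as the codon gives the anti-codon written 3′ to 5′, as in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base Watson-Crick complement, same left-to-right order.

    For a codon written 5'->3', the result is the anti-codon written 3'->5'.
    """
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complementary strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    biotin_codon = "TGACGGGT"  # 5'-TGACGGGT-3' from the text
    print("anti-codon (3'->5'):", complement(biotin_codon))
    print("anti-codon (5'->3'):", reverse_complement(biotin_codon))
```

The first printed sequence matches the 3′-ACTGCCCA-5′ reagent described above, confirming that the biotin-bearing reagent is the perfect complement of the biotin-encoding template region.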

Digestion with the restriction endonuclease Tsp45I, which cleaves GTGAC and therefore cuts the biotin encoding template but none of the other templates, revealed a 1:1 ratio of biotin encoding to non-biotin encoding templates following selection (FIG. 12). This represents a 1,000-fold enrichment compared with the unselected library. DNA sequencing of the PCR amplified pool before and after selection suggested a similar degree of enrichment and indicated that the biotin-encoding template is the major product after selection and amplification (FIG. 12). The ability of DNA-templated synthesis to support the simultaneous sequence-specific reaction of 1,025 reagents, each of which faces a 1,024:1 ratio of non-partner to partner templates, demonstrates its potential as a method to create synthetic libraries in one pot. The above proof-of-principle translation, selection, and amplification of a synthetic library member having a specific property (avidin affinity in this example) addresses several key requirements for the evolution of non-natural small molecule libraries toward desired properties.
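The enrichment figure can be reproduced from the library composition. The sketch below assumes the unselected library is equimolar (one biotin-encoding template among 1,025, as stated) and defines enrichment as the change in the ratio of biotin-encoding to non-biotin-encoding templates; the function name is hypothetical.

```python
def enrichment_factor(ratio_before: float, ratio_after: float) -> float:
    """Fold change in the target:non-target ratio across a selection."""
    return ratio_after / ratio_before


LIBRARY_SIZE = 1025                              # templates in the one-pot library
RATIO_BEFORE = 1 / (LIBRARY_SIZE - 1)            # 1 biotin-encoding : 1,024 others
RATIO_AFTER = 1 / 1                              # 1:1 after selection (FIG. 12)

if __name__ == "__main__":
    fold = enrichment_factor(RATIO_BEFORE, RATIO_AFTER)
    print(f"enrichment: {fold:.0f}-fold")
```

The computed value of 1,024 is what the text rounds to a 1,000-fold enrichment.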

Taken together, these results suggest that DNA-templated synthesis is a surprisingly general phenomenon capable of directing, rather than simply encoding, a range of chemical reactions to form products unrelated in structure to nucleic acid backbones. For several reactions examined, the DNA-templated format accelerates the rate of bond formation beyond the rate of a 10-base DNA oligonucleotide annealing to its complement, resulting in surprising distance independence. The facile nature of long-distance DNA-templated reactions may also arise in part from the tendency of water to contract the volume of nonpolar reactants (see, C.-J. Li et al. Organic Reactions in Aqueous Media, Wiley and Sons: New York, 1997) and from possible compactness of the intervening single-stranded DNA between reactive groups. These findings may have implications for prebiotic evolution and for understanding the mechanisms of catalytic nucleic acids, which typically localize substrates to a strand of RNA or DNA.

Methods:

DNA synthesis. DNA oligonucleotides were synthesized on a PerSeptive Biosystems Expedite 8909 DNA synthesizer using standard protocols and purified by reverse phase HPLC. Oligonucleotides were quantitated spectrophotometrically and by denaturing polyacrylamide gel electrophoresis (PAGE) followed by staining with ethidium bromide or SYBR Green (Molecular Probes) and quantitation using a Stratagene Eagle Eye II densitometer. Phosphoramidites enabling the synthesis of 5′-NH₂-dT, 5′ tetrachlorofluorescein, abasic backbone spacer, C3 backbone spacer, 9-bond polyethylene glycol spacer, 12-bond saturated hydrocarbon spacer, and 5′ biotin groups were purchased from Glen Research. Thiol-linked oligonucleotide reagents were synthesized on C3 disulfide controlled pore glass (Glen Research).

Template functionalization. Templates bearing 5′-NH₂-dT groups were transformed into a variety of electrophilic functional groups by reaction with the appropriate electrophile-NHS ester (Pierce). Reactions were performed in 200 mM sodium phosphate pH 7.2 with 2 mg/mL electrophile-NHS ester, 10% DMSO, and up to 100 μg of 5′-amino template at 25° C. for 1 h. Desired products were purified by reverse-phase HPLC and characterized by gel electrophoresis and MALDI mass spectrometry.

DNA-templated synthesis reactions. Reactions were initiated by mixing equimolar quantities of reagent and template in buffer containing 50 mM MOPS pH 7.5 and 250 mM NaCl at the desired temperature (25° C. unless stated otherwise). Concentrations of reagents and templates were 60 nM unless otherwise indicated. At various time points, aliquots were removed, quenched with excess β-mercaptoethanol, and analyzed by denaturing PAGE. Reaction products were quantitated by densitometry using their intrinsic fluorescence or by staining followed by densitometry. Representative products were also verified by MALDI mass spectrometry.

In vitro selection for avidin binding. Products of the library translation reaction were isolated by ethanol precipitation and dissolved in binding buffer (10 mM Tris pH 8, 1 M NaCl, 10 mM EDTA). Products were incubated with 30 μg of streptavidin-linked magnetic beads (Roche Biosciences) for 10 min at room temperature in 100 μL total volume. Beads were washed 16 times with binding buffer and eluted by treatment with 1 μmol free biotin in 100 μL binding buffer at 70° C. for 10 minutes. Eluted molecules were isolated by ethanol precipitation and amplified by standard PCR protocols (2 mM MgCl₂, 55° C. annealing, 20 cycles) using the primers 5′-TGGTGCGGAGCCGCCG and 5′-CCACTGTCCGTGGCGCGACCCCGGCTCC TCGGCTCGG. Automated DNA sequencing used the primer 5′-CCACTGTCCGTGGCGCGACCC.

DNA Sequences. Sequences not provided in the figures are as follows: matched reagent in FIG. 6b SIAB and SBAP reactions: 5′-CCCGAGTCGAAGTCGTACC-SH; mismatched reagent in FIG. 6b SIAB and SBAP reactions: 5′-GGGCTCAGCTTCCCCATAA-SH; mismatched reagents for other reactions in FIGS. 6b, 6c, 6d, and 8a: 5′-FAAATCTTCCC-SH (F=tetrachlorofluorescein); reagents in FIGS. 6c and 6d containing one mismatch: 5′-FAATTCTTACC-SH; E templates in FIGS. 6a, 6b SMCC, GMBS, BMPS, and SVSB reactions, and 8a: 5′-(NH₂dT)-CGCGAGCGTACGCTCGCGATGGTACGAATTCGACTCGGGAATACCACCTTCGACTCGAGG; H template in FIG. 6b SIAB, SBAP, and SIA reactions: 5′-(NH₂dT)-CGCGAGCGTACG CTCGCG ATGGTACGAATTC; clamp oligonucleotide in FIG. 8b: 5′-ATTCGTACCA.

Example 2 Exemplary Reactions for Use in DNA-Templated Synthesis

As discussed above, the generality of DNA-templated synthetic chemistry was examined (see, Liu et al. J. Am. Chem. Soc. 2001, 123, 6961). Specifically, the ability of DNA-templated synthesis to direct a modest collection of chemical reactions without requiring the precise alignment of reactive groups into DNA-like conformations was demonstrated. Indeed, the distance independence and sequence fidelity of DNA-templated synthesis allowed the simultaneous, one-pot translation of a model library of more than 1,000 templates into the corresponding thioether products, one of which was enriched by in vitro selection for binding to the protein streptavidin and amplified by PCR.

As described in detail herein, the generality of DNA-templated synthesis has been further expanded, and it has been demonstrated that a variety of chemical reactions can be utilized for the construction of small molecules, including, for the first time, DNA-templated organometallic couplings and carbon-carbon bond forming reactions other than pyrimidine photodimerization. These reactions clearly represent an important step towards the in vitro evolution of non-natural synthetic molecules by enabling the DNA-templated construction of a much more diverse set of structures than has previously been achieved.

The ability of DNA-templated synthesis to direct reactions that require a non-DNA-linked activator, catalyst or other reagent in addition to the principal reactants has also been demonstrated herein. To test the ability of DNA-templated synthesis to mediate such reactions without requiring structural mimicry of the DNA-templated backbone, DNA-templated reductive aminations between an amine-linked template (1) and benzaldehyde- or glyoxal-linked reagents (3) were performed with millimolar concentrations of NaBH₃CN at room temperature in aqueous solutions. Significantly, products formed efficiently when the template and reagent sequences were complementary, while control reactions in which the sequence of the reagent did not complement that of the template, or in which NaBH₃CN was omitted, yielded no significant product (see FIGS. 13 and 14). Although DNA-templated reductive aminations to generate products closely mimicking the structure of double-stranded DNA have been previously reported (see, for example, X. Li et al. J. Am. Chem. Soc. 2002, 124, 746 and Y. Gat et al. Biopolymers 1998, 48, 19), the above results demonstrate that reductive amination to generate structures unrelated to the phosphoribose backbone can take place efficiently and sequence-specifically. Referring to FIG. 15, DNA-templated amide bond formations between amine-linked templates 4 and 5 and carboxylate-linked reagents 6-9, mediated by 1-(3-dimethylaminopropyl)-3-ethylcarbodiimide (EDC) and N-hydroxysulfosuccinimide (sulfo-NHS), generated amide products in good yields at pH 6.0, 25° C. Product formation was sequence-specific, dependent on the presence of EDC, and surprisingly insensitive to the steric encumbrance of the amine or carboxylate. Efficient DNA-templated amide formation was also mediated by the water-stable activator 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMT-MM) instead of EDC and sulfo-NHS (FIGS. 14 and 15). The efficiency and generality of DNA-templated amide bond formation under these conditions, together with the large number of commercially available chiral amines and carboxylic acids, make this reaction an attractive candidate in future DNA-templated syntheses of structurally diverse small molecule libraries.

It will be appreciated that carbon-carbon bond forming reactions are also important in both chemical and biological syntheses and thus several such reactions are utilized in DNA-templated format. Both the reaction of nitroalkane-linked reagent (10) with aldehyde-linked template (11) (nitro-aldol or Henry reaction) and the conjugate addition of 10 to maleimide-linked template (12) (nitro-Michael addition) proceeded efficiently and with high sequence specificity at pH 7.5-8.5, 25° C. (FIGS. 13 and 14). In addition, the sequence-specific DNA-templated Wittig reaction between stabilized phosphorus ylide reagent 13 and aldehyde-linked templates 14 or 11 provided the corresponding olefin products in excellent yields at pH 6.0-8.0, 25° C. (FIGS. 13 and 14). Similarly, the DNA-templated 1,3-dipolar cycloaddition between nitrone-linked reagents 15 and 16 and olefin-linked templates 12, 17 or 18 also afforded products sequence-specifically at pH 7.5, 25° C. (FIGS. 13 and 14).

In addition to the reactions described above, organometallic coupling reactions can also be utilized in the present invention. For example, DNA-templated Heck reactions were performed in the presence of water-soluble Pd precatalysts. In the presence of 170 mM Na₂PdCl₄, aryl iodide-linked reagent 19 and a variety of olefin-linked templates including maleimide 12, acrylamide 17, vinyl sulfone 18 or cinnamamide 20 yielded Heck coupling products in modest yields at pH 5.0, 25° C. (FIGS. 13 and 14). For couplings with olefins 17, 18 and 20, adding two equivalents of P(p-SO₃C₆H₄)₃ per equivalent of Pd prior to template and reagent addition typically increased overall yields by 2-fold. Control reactions containing sequence mismatches or lacking Pd precatalyst yielded no product. To our knowledge, the above DNA-templated nitro-aldol addition, nitro-Michael addition, Wittig olefination, dipolar cycloaddition, and Heck coupling represent the first reported nucleic acid-templated organometallic reactions and carbon-carbon bond forming reactions other than pyrimidine photodimerization.

It was previously discovered that some DNA-templated reactions demonstrate distance independence, the ability to form product at a rate independent of the number of intervening bases between annealed reactants. It was hypothesized (FIG. 16a) that distance independence arises when the rate of bond formation in the DNA-templated reaction is greater than the rate of template-reagent annealing. Although only a subset of chemistries fall into this category, any DNA-templated reaction that affords comparable product yields when the reagent is annealed at various distances from the reactive end of the template is of special interest because it can be encoded at a variety of template positions. To evaluate the ability of the DNA-templated reactions developed above to take place efficiently when reactants are separated by distances relevant to library encoding, the yields of reductive amination, amide formation, nitro-aldol addition, nitro-Michael addition, Wittig olefination, dipolar cycloaddition, and Heck coupling when zero or ten bases separated annealed reactive groups (FIG. 16a, n=0 versus n=10) were compared. Among the reactions described above or in previous work, amide bond formation, nitro-aldol addition, Wittig olefination, Heck coupling, conjugate addition of thiols to maleimides and S_N2 reaction between thiols and α-iodo amides demonstrate comparable product formation when reactive groups are separated by zero or ten bases (FIG. 16b). These findings indicate that these reactions can be encoded during synthesis by nucleotides that are distal from the reactive end of the template without significantly impairing product formation.

In addition to the DNA-templated S_N2 reaction, conjugate addition, vinyl sulfone addition, amide bond formation, reductive amination, nitro-aldol (Henry reaction), nitro-Michael, Wittig olefination, 1,3-dipolar cycloaddition and Heck coupling reactions described directly above, a variety of additional reactions can also be utilized in the method of the present invention. For example, as depicted in FIG. 17, powerful aqueous DNA-templated synthetic reactions including, but not limited to, the Lewis acid-catalyzed aldol addition, Mannich reaction, Robinson annulation reactions, additions of allyl indium, zinc and tin to ketones and aldehydes, Pd-assisted allylic substitution, Diels-Alder cycloadditions, and hetero-Diels-Alder reactions can be utilized efficiently in aqueous solvent and are important complexity-building reactions.

Taken together, these results expand considerably the reaction scope ofDNA-templated synthesis. A wide variety of reactions proceededefficiently and selectively only when the corresponding reactants areprogrammed with complementary sequences. By augmenting the repertoire ofknown DNA-templated reactions to now include carbon-carbon bond formingand organometallic reactions (nitro-aldol additions, nitro-Michaeladditions, Wittig olefinations, dipolar cycloadditions, and Heckcouplings) in addition to previously reported amide bond formation (see,Schmidt et al. Nucleic Acids Res. 1997, 25, 4792; Bruick et al. Chem.Biol. 1996, 3, 49), imine formation (Czlapinski et al. J. Am. Chem. Soc.2001, 123, 8618), reductive amination (Li et al. J. Am. Chem. Soc. 2002,124, 746; Gat et al. Biopolymers, 1998, 48, 19), S_(N)2 reactions(Gartner et al. J. Am. Chem. Soc. 2001, 123, 6961; Xu et al. Nat.Biotechnol. 2001, 19, 148; Herrlein et al. J. Am. Chem. Soc. 1995, 117,10151) conjugate addition of thiols (Gartner et al. J. Am. Chem. Soc.2001, 123, 6961), and phosphoester or phosphonamide formation (Orgel etal. Acc. Chem. Res. 1995, 28, 109; Luther et al. Nature, 1998, 396,245), these results may enable the sequence-specific translation oflibraries of DNA into libraries of structurally and functionally diversesynthetic products. Since minute quantities of templates encodingdesired molecules can be amplified by PCR, the yields of DNA-templatedreactions are arguably less critical than the yields of traditionalsynthetic transformations. Nevertheless, many of the reactions developedabove proceed efficiently. 
In addition, by demonstrating that DNA-templated synthesis in the absence of proteins can direct a large diversity of chemical reactions, these findings support previously proposed hypotheses that nucleic acid-templated synthesis may have translated replicable information into some of the earliest functional molecules such as polyketides, terpenes and polypeptides prior to the evolution of protein-based enzymes. The diversity of chemistry shown here to be controllable simply by bringing reactants into proximity by DNA hybridization without obvious structural requirements provides an experimental basis for these possibilities. The translation of amplifiable information into a wide range of structures is a key requirement for applying nature's molecular evolution approach to the discovery of non-natural molecules with new functions.

Methods for Exemplary Reactions for Use in DNA-Templated Synthesis:

Functionalized templates and reagents were typically prepared by reacting 5′-NH₂ terminated oligonucleotides (for template 1), 5′-NH₂-(CH₂O)₂ terminated oligonucleotides (for all other templates) or 3′-OPO₃-CH₂CH(CH₂OH)(CH₂)₄NH₂ terminated oligonucleotides (for all reagents) with the appropriate NHS esters (0.1 volumes of a 20 mg/mL solution in DMF) in 0.2 M sodium phosphate buffer, pH 7.2, at 25° C. for 1 h to provide the template and reagent structures shown in FIGS. 13 and 15. For amino acid linked reagents 6-9, 3′-OPO₃-CH₂CH(CH₂OH)(CH₂)₄NH₂ terminated oligonucleotides in 0.2 M sodium phosphate buffer, pH 7.2, were reacted with 0.1 volumes of a 100 mM bis[2-(succinimidyloxycarbonyloxy)ethyl]sulfone (BSOCOES, Pierce) solution in DMF for 10 min at 25° C., followed by 0.3 volumes of a 300 mM amino acid in 300 mM NaOH for 30 min at 25° C.

Functionalized templates and reagents were purified by gel filtration using Sephadex G-25 followed by reverse-phase HPLC (0.1 M triethylammonium acetate-acetonitrile gradient) and characterized by MALDI mass spectrometry. DNA-templated reactions were conducted under the conditions described in FIGS. 13 and 15, and products were characterized by denaturing polyacrylamide gel electrophoresis and MALDI mass spectrometry.

The sequences of oligonucleotide templates and reagents are as follows (5′ to 3′ direction; n refers to the number of bases between reactive groups when template and reagent are annealed as shown in FIG. 16). 1: TGGTACGAATTCGACTCGGG; 2 and 3 matched: GAGTCGAATTCGTACC; 2 and 3 mismatched: GGGCTCAGCTTCCCCA; 4 and 5: GGTACGAATTCGACTCGGGAATACCACCTT; 6-9 matched (n=10): TCCCGAGTCG; 6 matched (n=0): AATTCGTACC; 6-9 mismatched: TCACCTAGCA; 11, 12, 14, 17, 18, 20: GGTACGAATTCGACTCGGGA; 10, 13, 16, 19 matched: TCCCGAGTCGAATTCGTACC; 10, 13, 16, 19 mismatched: GGGCTCAGCTTCCCCATAAT; 15 matched: AATTCGTACC; 15 mismatched: TCGTATTCCA; template for n=10 vs. n=0 comparison: TAGCGATTACGGTACGAATTCGACTCGGGA.
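The matched and mismatched designations above can be checked computationally. The following is a minimal sketch, not part of the described methods, that tests whether the reverse complement of a reagent sequence occurs within a template sequence. The function names are hypothetical, and the template-reagent pairings used (reagents 2 and 3 with template 1; reagents 6-9 with templates 4 and 5) are an assumption inferred from the complementarity of the listed sequences. Exact string matching stands in for perfect Watson-Crick pairing and ignores partial hybridization.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a 5'-to-3' DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals(template: str, reagent: str) -> bool:
    """True if the reagent can pair perfectly with a stretch of the template."""
    return reverse_complement(reagent) in template


# Sequences copied from the listing above (5' to 3').
TEMPLATE_1 = "TGGTACGAATTCGACTCGGG"
MATCHED_2_3 = "GAGTCGAATTCGTACC"
MISMATCHED_2_3 = "GGGCTCAGCTTCCCCA"
TEMPLATE_4_5 = "GGTACGAATTCGACTCGGGAATACCACCTT"
MATCHED_6_9 = "TCCCGAGTCG"
MISMATCHED_6_9 = "TCACCTAGCA"

if __name__ == "__main__":
    for name, template, reagent in [
        ("2/3 matched", TEMPLATE_1, MATCHED_2_3),
        ("2/3 mismatched", TEMPLATE_1, MISMATCHED_2_3),
        ("6-9 matched", TEMPLATE_4_5, MATCHED_6_9),
        ("6-9 mismatched", TEMPLATE_4_5, MISMATCHED_6_9),
    ]:
        print(name, anneals(template, reagent))
```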

Reaction yields were quantitated by denaturing polyacrylamide gel electrophoresis followed by ethidium bromide staining, UV visualization, and CCD-based densitometry of product and template starting material bands. Yield calculations assumed that templates and products stained with equal intensity per base; for those cases in which products are partially double-stranded during quantitation, changes in staining intensity may result in higher apparent yields.
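One way to read the equal-intensity-per-base assumption is sketched below: each band intensity is divided by the length of the corresponding strand to give a relative molar amount, and the yield is the product fraction of the total. The exact formula used for the reported yields is not stated, so this function, its name, and the example numbers are illustrative assumptions only.

```python
def apparent_yield(product_intensity: float, product_length: int,
                   template_intensity: float, template_length: int) -> float:
    """Estimate fractional yield from band intensities.

    Assumes staining intensity is proportional to the number of bases, so
    intensity / length is proportional to the molar amount in each band.
    """
    if product_length <= 0 or template_length <= 0:
        raise ValueError("strand lengths must be positive")
    product_molar = product_intensity / product_length
    template_molar = template_intensity / template_length
    total = product_molar + template_molar
    if total <= 0:
        raise ValueError("no signal in either band")
    return product_molar / total


if __name__ == "__main__":
    # Hypothetical densitometry readings (arbitrary units), not measured data:
    # a 30-base product band and a 20-base unreacted template band.
    print(apparent_yield(300.0, 30, 200.0, 20))
```

A partially double-stranded product would bind more stain per strand than this model assumes, which is why such cases can give higher apparent yields.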

Example 3 Development of Exemplary Linkers

As will be appreciated by one of ordinary skill in the art, it is frequently useful to leave the DNA moiety of the reagents linked to products during reaction development to facilitate analysis by gel electrophoresis. The use of DNA-templated synthesis to translate libraries of DNA into corresponding libraries of synthetic small molecules suitable for in vitro selection, however, requires the development of cleavable linkers connecting reactive groups of reagents with their decoding DNA oligonucleotides. As described below and herein, three exemplary types of linkers have been developed (see, FIG. 18). For reagents with one reactive group, it would be desirable to position DNA as a leaving group to the reactive moiety. Under this “autocleavable” linker strategy, the DNA-reactive group bond is cleaved as a natural consequence of the reaction. As but one example of this approach, a fluorescent Wittig phosphorane reagent (14, referring to FIG. 19) was synthesized in which the decoding DNA oligonucleotide was attached to one of the aryl phosphine groups (see, FIG. 19, left). DNA-templated Wittig reaction with aldehyde-linked templates resulted in the nearly quantitative transfer of the fluorescent group from the Wittig reagent to the template and the concomitant liberation of the alkene product from the DNA moiety of the reagent. Additionally, reagents bearing more than one reactive group can be linked to their decoding DNA oligonucleotides through one of two additional linker strategies. In the “scarless” linker strategy, DNA-templated reaction of one reactive group is followed by cleavage of the linker attached through a second reactive group to yield products without leaving behind additional chemical functionality. For example, a series of amino acid reagents were synthesized which were connected through a carbamoylethylsulfone linker to their decoding DNA oligonucleotides (FIG. 19, center). 
Products of DNA-templated amide bond formation using these amino acid reagents were treated with aqueous alkaline buffer to effect the quantitative elimination and spontaneous decarboxylation of the carbamoyl group. The product of cleaving this scarless linker is therefore the cleanly transferred amino acid moiety. In yet another embodiment of the invention, a third linker strategy, a “useful scar”, may be utilized on the theory that it may be advantageous to introduce useful chemical groups as a consequence of linker cleavage. In particular, a “useful scar” is left behind following linker cleavage and can be functionalized in subsequent steps. For example, amino acid reagents linked through 1,2-diols to their decoding DNA oligonucleotides were generated. Following amide bond formation, this linker was quantitatively cleaved by oxidation with NaIO₄ to afford products bearing useful aldehyde groups (see, FIG. 19, right). In addition to the linkers described directly above, a variety of additional linkers can be utilized. For example, as shown in FIG. 20, a thioester linker can be generated by carbodiimide-mediated coupling of thiol-terminated DNA with carboxylate-containing reagents and can be cleaved with aqueous base. As the carboxylate group provides entry into the DNA-templated amide bond formation reactions described above, this linker would liberate a “useful scar” when cleaved (see, FIG. 20). Alternatively, the thioester linker can be used as an autocleavable linker during an amine acylation reaction in the presence of Ag(I) cations (see, Zhang et al. J. Am. Chem. Soc. 1999, 121, 3311-3320) since the thiol-DNA moiety of the reagent is liberated as a natural consequence of the reaction. It will be appreciated that a thioether linker that can be oxidized and eliminated at pH 11 to liberate a vinyl sulfone can be utilized as a “useful scar” linker. As demonstrated herein, the vinyl sulfone group serves as the substrate in a number of subsequent DNA-templated reactions.

Example 4 Exemplary Reactions in Organic Solvents

As demonstrated herein, a variety of DNA-templated reactions can occur in aqueous media. It has also been demonstrated, as discussed below, that DNA-templated reactions can occur in organic solvents, thus greatly expanding the scope of DNA-templated synthesis. Specifically, DNA templates and reagents have been complexed with long chain tetraalkylammonium cations (see, Jost et al. Nucleic Acids Res. 1989, 17, 2143; Mel'nikov et al. Langmuir, 1999, 15, 1923-1928) to enable quantitative dissolution of reaction components in anhydrous organic solvents including CH₂Cl₂, CHCl₃, DMF and MeOH. Surprisingly, it was found that DNA-templated synthesis can indeed occur in anhydrous organic solvents with high sequence selectivity. Depicted in FIG. 21 are DNA-templated amide bond formation reactions in which reagents and templates are complexed with dimethyldidodecylammonium cations either in separate vessels or after preannealing in water, lyophilized to dryness, dissolved in CH₂Cl₂, and mixed together. Matched, but not mismatched, reactions provided products both when reactants were preannealed in aqueous solution and when they were mixed for the first time in CH₂Cl₂ (see, FIG. 21). DNA-templated amide formation and Pd-mediated Heck coupling in anhydrous DMF also proceeded sequence-specifically. Clearly, these observations of sequence-specific DNA-templated synthesis in organic solvents imply the presence of at least some secondary structure within tetraalkylammonium-complexed DNA in organic media, and should enable DNA receptors and catalysts to be evolved towards stereoselective binding or catalytic properties in organic solvents. Specifically, DNA-templated reactions that are known to occur in aqueous media, including conjugate additions, cycloadditions, displacement reactions, and Pd-mediated couplings, can also be performed in organic solvents. 
In certain other embodiments, reactions in organic solvents may be utilized that are inefficient or impossible to perform in water. For example, while Ru-catalyzed olefin metathesis in water has been reported by Grubbs and co-workers (see, Lynn et al. J. Am. Chem. Soc. 1998, 120, 1627-1628; Lynn et al. J. Am. Chem. Soc. 2000, 122, 6601-6609; Mohr et al. Organometallics 1996, 15, 4317-4325), the aqueous metathesis system is extremely functional group sensitive. The functional group tolerance of Ru-catalyzed olefin metathesis in organic solvents, however, is significantly more robust. Some exemplary reactions to utilize in organic solvents include, but are not limited to, 1,3-dipolar cycloaddition between nitrones and olefins, which can proceed through transition states that are less polar than ground state starting materials.

As detailed above, the generality of DNA-templated synthesis has been established by performing several distinct DNA-templated reaction types, none of which are limited to producing structures that resemble the natural nucleic acid backbone, and many of which are highly useful carbon-carbon bond forming or complexity-building synthetic reactions. It has been shown that the distance independence of DNA-templated synthesis allows different regions of a DNA template to each encode different synthetic reactions. DNA-templated synthesis can maintain sequence fidelity even in a library format in which more than 1,000 templates and 1,000 reagents react simultaneously in one pot. As described above and below, linker strategies have been developed which, together with the reactions developed as described above, have enabled the first multi-step DNA-templated synthesis of simple synthetic small molecules. Additionally, sequence-specific DNA-templated synthesis in organic solvents has been demonstrated, further expanding the scope of this approach.

Example 5 Synthesis of Exemplary Compounds and Libraries of Compounds

A) Synthesis of a Polycarbamate Library: One embodiment of the strategy described above is the creation of an amplifiable polycarbamate library. Of the sixteen possible dinucleotides used to encode the library, one is assigned a start codon function, and one is assigned to serve as a stop codon. An artificial genetic code is then created assigning each of the up to 14 remaining dinucleotides to a different monomer. For geometric reasons one monomer actually contains a dicarbamate containing two side chains. Within each monomer, the dicarbamate is attached to the corresponding dinucleotide (analogous to a tRNA anticodon) through a silyl enol ether linker which liberates the native DNA and the free carbamate upon treatment with fluoride. The dinucleotide moiety exists as the activated 5′-2-methylimidazole phosphate, which has been demonstrated (Inoue et al. J. Mol. Biol. 162:201, 1982; Rembold et al. J. Mol. Evol. 38:205, 1994; Chen et al. J. Mol. Biol. 181:271, 1985; Acevedo et al. J. Mol. Biol. 197:187, 1987; Inoue et al. J. Am. Chem. Soc. 103:7666, 1981; each of which is incorporated herein by reference) to serve as an excellent leaving group for template-directed oligomerization of nucleotides yet is relatively stable under neutral or basic aqueous conditions (Schwartz et al. Science 228:585, 1985; incorporated herein by reference). The dicarbamate moiety exists in a cyclic form linked through a vinyloxycarbonate linker. The vinylcarbonate group has been demonstrated to be stable in neutral or basic aqueous conditions (Olofson et al. Tetrahedron Lett. 18:1563, 1977; Olofson et al. Tetrahedron Lett. 18:1567, 1977; Olofson et al. Tetrahedron Lett. 18:1571, 1977; each of which is incorporated herein by reference) and further has been shown to provide carbamates in very high yields upon the addition of amines (Olofson et al. Tetrahedron Lett. 18:1563, 1977; incorporated herein by reference).
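The bookkeeping of such a dinucleotide code can be sketched as follows. This is an illustration only: the choice of AA as the start codon and TT as the stop codon, the monomer labels M1 to M14, and the function names are all hypothetical, since no specific assignments are given here. For simplicity the sketch reads codons directly from the template, whereas each monomer would in practice carry the complementary dinucleotide.

```python
from itertools import product

BASES = "ACGT"
# All 4 x 4 = 16 possible dinucleotide codons.
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]


def build_code(start: str, stop: str, monomers: list[str]) -> dict[str, str]:
    """Assign one start codon, one stop codon, and up to 14 monomer codons."""
    if start == stop:
        raise ValueError("start and stop codons must differ")
    free = [d for d in DINUCLEOTIDES if d not in (start, stop)]
    if len(monomers) > len(free):
        raise ValueError("at most 14 monomers can be encoded")
    code = {start: "START", stop: "STOP"}
    code.update(zip(free, monomers))
    return code


def translate(template: str, code: dict[str, str]) -> list[str]:
    """Read a template two bases at a time and look up each codon."""
    if len(template) % 2:
        raise ValueError("template length must be a multiple of 2")
    codons = [template[i:i + 2] for i in range(0, len(template), 2)]
    return [code[c] for c in codons]


if __name__ == "__main__":
    # Hypothetical assignments: AA = start, TT = stop, M1..M14 = monomers.
    code = build_code("AA", "TT", [f"M{i}" for i in range(1, 15)])
    print(translate("AAACAGTT", code))
```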

When attacked by an amine from a nascent polycarbamate chain, the vinyl carbonate linker, driven by the aromatization of m-cresol, liberates a free amine. This free amine subsequently serves as the nucleophile to attack the next vinyloxycarbonate, propagating the polymerization of the growing carbamate chain. Such a strategy minimizes the potential for cross-reactivity and bi-directional polymerization by ensuring that only one nucleophile is present at any time during polymerization.

Using the monomer described above, artificial translation of DNA into a polycarbamate can be viewed as a three-stage process. In the first stage, single-stranded DNA templates encoding the library are used to guide the assembly and polymerization of the dinucleotide moieties of the monomers, terminating with the “stop” monomer, which possesses a 3′ methyl ether instead of a 3′ hydroxyl group (FIG. 22).

Once the nucleotides have assembled and polymerized into double-stranded DNA, the “start” monomer ending in an o-nitrobenzylcarbamate is photodeprotected to reveal the primary amine that initiates carbamate polymerization. Polymerization proceeds in the 5′ to 3′ direction along the DNA backbone, with each nucleophilic attack resulting in the subsequent unmasking of a new amine nucleophile. Attack of the “stop” monomer liberates an acetamide rather than an amine, thereby terminating polymerization (FIG. 23). Because the DNA at this stage exists in a stable double-stranded form, variables such as temperature and pH may be explored to optimize polymerization efficiency.

Following polymerization, the polycarbamate is cleaved from the phosphate backbone of the DNA upon treatment with fluoride. Desilylation of the enol ether linker and the elimination of the phosphate driven by the resulting release of phenol provides the polycarbamate covalently linked at its carboxy terminus to its encoding single-stranded DNA (FIG. 24).

At this stage the polycarbamate may be completely liberated from the DNA by base hydrolysis of the ester linkage. The liberated polycarbamate can be purified by HPLC and retested to verify that its desired properties are intact. The free DNA can be amplified using PCR, mutated with error-prone PCR (Cadwell et al. PCR Methods Appl. 2:28, 1992; incorporated herein by reference) or DNA shuffling (Stemmer Proc. Natl. Acad. Sci. USA 91:10747, 1994; Stemmer Nature 370:389, 1994; U.S. Pat. No. 5,811,238, issued Sep. 22, 1998; each of which is incorporated herein by reference), and/or sequenced to reveal the primary structure of the polycarbamate.

Synthesis of monomer units. After the monomers are synthesized, the assembly and polymerization of the monomers on the DNA scaffold should occur spontaneously. Shikimic acid 1, available commercially, biosynthetically (Davis Adv. Enzymol. 16:287, 1955; incorporated herein by reference), or by short syntheses from D-mannose (Fleet et al. J. Chem. Soc. Perkin Trans. I 905, 1984; Harvey et al. Tetrahedron Lett. 32:4111, 1991; each of which is incorporated herein by reference), serves as a convenient starting point for the monomer synthesis. The syn hydroxyl groups are protected as the p-methoxybenzylidene, and the remaining hydroxyl group as the tert-butyldimethylsilyl ether, to afford 2. The carboxylate moiety of the protected shikimic acid is then reduced completely by LAH reduction, tosylation of the resulting alcohol, and further reduction with LAH to provide 3.

Commercially available and synthetically accessible N-protected amino acids serve as the starting materials for the dicarbamate moiety of each monomer. Reactive side chains are protected as photolabile ethers, esters, acetals, carbamates, or thioethers. Following chemistry previously developed (Cho et al. Science 261:1303, 1993; incorporated herein by reference), a desired amino acid 4 is converted to the corresponding amino alcohol 5 by mixed anhydride formation with isobutylchloroformate followed by reduction with sodium borohydride. The amino alcohol is then converted to the activated carbonate by treatment with p-nitrophenylchloroformate to afford 6, which is then coupled to a second amino alcohol 7 to provide, following hydroxyl group silylation and FMOC deprotection, carbamate 8.

Coupling of carbamate 8 onto the shikimic acid-derived linker proceeds as follows. The allylic hydroxyl group of 3 is deprotected with TBAF, treated with triflic anhydride to form the secondary triflate, then displaced with aminocarbamate 8 to afford 9. Presence of the vinylic methyl group in 3 should assist in minimizing the amount of undesired product resulting from S_(N)2′ addition (Magid Tetrahedron 36:1901, 1980; incorporated herein by reference). Michael additions of deprotonated carbamates to α,β-unsaturated esters have been well documented (Collado et al. Tetrahedron Lett. 35:8037, 1994; Hirama et al. J. Am. Chem. Soc. 107:1797, 1985; Nagasaka et al. Heterocycles 29:155, 1989; Shishido et al. J. Chem. Soc. Perkin Trans. I 993, 1987; Hirama et al. Heterocycles 28:1229, 1989; each of which is incorporated herein by reference). By analogy, the secondary amine is protected as the o-nitrobenzyl carbamate (NBOC), and the resulting compound is deprotonated at the carbamate nitrogen. This deprotonation can typically be performed with either sodium hydride or potassium tert-butoxide (Collado et al. Tetrahedron Lett. 35:8037, 1994; Hirama et al. J. Am. Chem. Soc. 107:1797, 1985; Nagasaka et al. Heterocycles 29:155, 1989; Shishido et al. J. Chem. Soc. Perkin Trans. I 993, 1987; Hirama et al. Heterocycles 28:1229, 1989; each of which is incorporated herein by reference), although other bases may be utilized to minimize deprotonation of the nitrobenzylic protons. Addition of the deprotonated carbamate to α,β-unsaturated ketone 10, followed by trapping of the resulting enolate with TBSCl, should afford silyl enol ether 11. The previously found stereoselectivity of conjugate additions to 5-substituted enones such as 10 (House et al. J. Org. Chem. 33:949, 1968; Still et al. Tetrahedron 37:3981, 1981; each of which is incorporated herein by reference) suggests preferential formation of 11 over its diastereomer. 
Ketone 10, the precursor to the fluoride-cleavable carbamate-phosphate linker, may be synthesized from 2 by one pot decarboxylation (Barton et al. Tetrahedron 41:3901, 1985; incorporated herein by reference) followed by treatment with TBAF, Swern oxidation of the resulting alcohol to afford 12, deprotection with DDQ, selective nitrobenzyl ether formation of the less-hindered alcohol, and reduction of the α-hydroxyl group with samarium iodide (Molander In Organic Reactions, Paquette, Ed. 46:211, 1994; incorporated herein by reference).

The p-methoxybenzylidene group of 11 is transformed into the α-hydroxy PMB ether using sodium cyanoborohydride and TMS chloride (Johansson et al. J. Chem. Soc. Perkin Trans. I 2371, 1984; incorporated herein by reference) and the TES group deprotected with 2% HF (conditions that should not affect the TBS ether (Boschelli et al. Tetrahedron Lett. 26:5239, 1985; incorporated herein by reference)) to provide 13. The PMB group, following precedent (Johansson et al. J. Chem. Soc. Perkin Trans. I 2371, 1984; Sutherlin et al. Tetrahedron Lett. 34:4897, 1993; each of which is incorporated herein by reference), should remain on the more hindered secondary alcohol. The two free hydroxyl groups may be macrocyclized by very slow addition of 13 to a solution of p-nitrophenyl chloroformate (or another phosgene analog), providing 14. The PMB ether is deprotected, and the resulting alcohol is converted into a triflate and eliminated under kinetic conditions with a sterically hindered base to afford vinyloxycarbonate 15. Photodeprotection of the nitrobenzyl ether and nitrobenzyl carbamate yields alcohol 16.

The monomer synthesis is completed by the sequential coupling of three components. Chlorodiisopropylaminophosphine 17 is synthesized by the reaction of PCl₃ with diisopropylamine (King et al. J. Org. Chem. 49:1784, 1984; incorporated herein by reference). Resin-bound (or 3′-o-nitrobenzylether protected) nucleoside 18 is coupled to 17 to afford phosphoramidite 19. Subsequent coupling of 19 with the nucleoside 20 (Inoue et al. J. Am. Chem. Soc. 103:7666, 1981; incorporated herein by reference) provides 21. Alcohol 16 is then reacted with 21 to yield, after careful oxidation using MCPBA or I₂ followed by cleavage from the resin (or photodeprotection), the completed monomer 22. This strategy of sequential coupling of 17 with alcohols has been successfully used to generate phosphates bearing three different alkoxy substituents in excellent yields (Bannwarth et al. Helv. Chim. Acta 70:175, 1987; incorporated herein by reference).

The unique start and stop monomers used to initiate and terminate carbamate polymerization may be synthesized by simple modification of the above scheme.

B) Evolvable Functionalized Peptide-Nucleic Acids (PNAs): In another embodiment an amplifiable peptide-nucleic acid library is created. Orgel and co-workers have demonstrated that peptide-nucleic acid (PNA) oligomers are capable of efficient polymerization on complementary DNA or RNA templates (Böhler et al. Nature 376:578, 1995; Schmidt et al. Nucl. Acids Res. 25:4792, 1997; each of which is incorporated herein by reference). This finding, together with the recent synthesis and characterization of chiral peptide nucleic acids bearing amino acid side chains (Haaima et al. Angew. Chem. Int. Ed. Engl. 35:1939-1942, 1996; Püschl et al. Tetrahedron Lett. 39:4707, 1998; each of which is incorporated herein by reference), allows the union of the polymer backbone and the growing nucleic acid strand into a single structure. In this example, each template consists of a DNA hairpin terminating in a 5′ amino group; the solid-phase and solution syntheses of such molecules have been previously described (Uhlmann et al. Angew. Chem. Int. Ed. Engl. 35:2632, 1996; incorporated herein by reference). Each extension monomer consists of a PNA trimer (or longer) bearing side chains containing functionality of interest. An artificial genetic code is written to assign each trinucleotide to a different set of side chains. Assembly, activation (with a carbodiimide and appropriate leaving group, for example), and polymerization of the PNA dimers along the complementary DNA template in the carboxy- to amino-terminal direction affords the unnatural polymer (FIG. 20). Choosing a “stop” monomer with a biotinylated N-terminus provides a convenient way of terminating the extension and purifying full-length polymers. The resulting polymers, covalently linked to their encoding DNA, are ready for selection, sequencing, or mutation.

The experimental approach towards implementing an evolvable functionalized peptide nucleic acid library comprises (i) improving and adapting known chemistry for the high efficiency template-directed polymerization of PNAs; (ii) defining a codon format (length and composition) suitable for PNA coupling of a number of diverse monomers on a complementary strand of encoding DNA free from significant infidelity, frameshifting, or spurious initiation of polymerization; (iii) choosing an initial set of side chains defining our new genetic code and synthesizing corresponding monomers; and (iv) subjecting a library of functionalized PNAs to cycles of selection, amplification, and mutation and characterizing the resulting evolved molecules to understand the basis of their novel activities.

(i) Improving coupling chemistry: While Orgel and coworkers have reported template-directed PNA polymerization, reported yields and number of successful couplings are significantly lower than would be desired. A promising route towards improving this key coupling process is exploring new coupling reagents, temperatures, and solvents which were not previously investigated (presumably because previous efforts focused on conditions which could have existed on prebiotic earth). The development of evolvable functionalized PNA polymers involves employing activators (DCC, DIC, EDC, HATU/DIEA, HBTU/DIEA, PyBOP/DIEA, chloroacetonitrile), leaving groups (2-methylimidazole, imidazole, pentafluorophenol, phenol, thiophenol, trifluoroacetate, acetate, toluenesulfonic acids, coenzyme A, DMAP, ribose), solvents (aqueous at several pH values, DMF, DMSO, chloroform, TFE), and temperatures (0° C., 4° C., 25° C., 37° C., 55° C.) in a large combinatorial screen to isolate new coupling conditions. Each well of a 384-well plate is assigned a specific combination of one activator, leaving group, solvent, and temperature. Solid-phase synthesis beads covalently linked to DNA hairpin templates are placed in each well, together with a fluorescently labeled PNA monomer complementary to the template. A successful coupling event results in the covalent linking of the fluorophore to the beads (FIG. 26); undesired non-templated coupling can be distinguished by control reactions with mismatched monomers. Following bead washing and cleavage of the product from solid support, each well is assayed with a fluorescence plate reader.
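The size of such a screen can be estimated by enumerating every combination of the listed variables. The sketch below is illustrative only; it treats “aqueous” as a single solvent entry because the number of pH values is not specified, and the well-naming scheme (rows A to P, columns 1 to 24) and all identifiers are assumptions rather than part of the described procedure.

```python
import math
import string
from itertools import product

ACTIVATORS = ["DCC", "DIC", "EDC", "HATU/DIEA", "HBTU/DIEA",
              "PyBOP/DIEA", "chloroacetonitrile"]
LEAVING_GROUPS = ["2-methylimidazole", "imidazole", "pentafluorophenol",
                  "phenol", "thiophenol", "trifluoroacetate", "acetate",
                  "toluenesulfonic acids", "coenzyme A", "DMAP", "ribose"]
# Assumption: "aqueous at several pH values" is counted once here.
SOLVENTS = ["aqueous", "DMF", "DMSO", "chloroform", "TFE"]
TEMPERATURES_C = [0, 4, 25, 37, 55]

WELLS_PER_PLATE = 384
ROWS = string.ascii_uppercase[:16]  # A..P
COLUMNS = 24


def well_name(index: int) -> str:
    """Map a 0-based index within a plate to a name such as A1 or P24."""
    row, col = divmod(index, COLUMNS)
    return f"{ROWS[row]}{col + 1}"


conditions = list(product(ACTIVATORS, LEAVING_GROUPS, SOLVENTS, TEMPERATURES_C))
plates_needed = math.ceil(len(conditions) / WELLS_PER_PLATE)

# (plate number, well name, condition) for every combination.
layout = [(i // WELLS_PER_PLATE + 1, well_name(i % WELLS_PER_PLATE), cond)
          for i, cond in enumerate(conditions)]

if __name__ == "__main__":
    print(len(conditions), "conditions on", plates_needed, "plates")
    print(layout[0])
```

Under these assumptions the screen comprises 7 × 11 × 5 × 5 = 1925 conditions, which just exceeds five 384-well plates; mismatched-monomer controls and additional pH values would increase the count further.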

(ii) Defining a codon format: While Nature has successfully employed a triplet codon in protein biosynthesis, a new polymer assembled under very different conditions without the assistance of enzymes may require an entirely novel codon format. Frameshifting may be remedied by lengthening each codon such that hybridizing a codon out of frame guarantees a mismatch (for example, by starting each codon with a G and by restricting subsequent positions in the codon to T, C, and A). Thermodynamically, one would also expect fidelity to improve as codon length increases to a certain point. Codons that are excessively long, however, will be able to hybridize despite mismatched bases and moreover complicate monomer synthesis. An optimal codon length for high fidelity artificial translation can be defined using the optimized plate-based combinatorial screen developed above. The length and composition of each codon in the template is varied by solid-phase synthesis of the appropriate DNA hairpin. These template hairpins are then allowed to couple with fluorescently labeled PNA monomers of varying sequence. The ideal codon format allows only monomers bearing exactly complementary sequences to couple with templates, even in the presence of mismatched PNA monomers (which are labeled differently to facilitate assaying of matched versus mismatched coupling). Triplet and quadruplet codons in which two bases are varied among A, T, and C while the remaining base or bases are fixed as G to ensure proper registration during polymerization are first studied.
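The frame-enforcing property of the example format (a leading G followed only by A, C, or T) can be verified by brute force. In the sketch below, which uses hypothetical function names, exact string equality stands in for perfect hybridization; real duplexes tolerate some mismatches, so this checks only the combinatorial argument and not the thermodynamics.

```python
from itertools import product


def framed_codons(length: int) -> list[str]:
    """All codons that start with G and use only A, C, T afterwards."""
    return ["G" + "".join(rest) for rest in product("ACT", repeat=length - 1)]


def out_of_frame_windows(message: str, length: int) -> list[str]:
    """Every codon-sized window of the message that is not in frame."""
    return [message[i:i + length]
            for i in range(len(message) - length + 1)
            if i % length != 0]


def frameshift_free(length: int) -> bool:
    """Check that no out-of-frame window in any codon pair is itself a codon."""
    codons = framed_codons(length)
    codon_set = set(codons)
    for first, second in product(codons, repeat=2):
        for window in out_of_frame_windows(first + second, length):
            if window in codon_set:
                return False
    return True


if __name__ == "__main__":
    for n in (3, 4):
        print(n, len(framed_codons(n)), frameshift_free(n))
```

Every out-of-frame window spans a codon boundary and therefore contains the next codon's G at a position other than the first, so it can never match an allowed codon. The cost is a smaller code: 9 triplets or 27 quadruplets instead of 64 or 256.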

(iii) Writing a new genetic code: Side chains are chosen which provide interesting functionality not necessarily present in natural biopolymers, which are synthetically accessible, and which are compatible with coupling conditions. For example, a simple genetic code which might be used to evolve a Ni²⁺ chelating PNA consists of a variety of protected carboxylate-bearing side chains as well as a set of small side chains to equip polymers with conformational flexibility and structural diversity (FIG. 27). Successful selection of PNAs capable of binding Ni²⁺ with high affinity could be followed by an expansion of this genetic code to include a fluorophore as well as a fluorescent quencher. The resulting polymers could then be evolved towards a fluorescent Ni²⁺ sensor which possesses different fluorescent properties in the absence or presence of nickel. Replacing the fluorescent side chain with a photocaged one may allow the evolution of polymers that chelate Ni²⁺ in the presence of certain wavelengths of light or which release Ni²⁺ upon photolysis. These simple examples demonstrate the tremendous flexibility in potential chemical properties of evolvable unnatural molecules conferred by the freedom to incorporate synthetic building blocks no longer limited to those compatible with the ribosomal machinery.

(iv) Selecting for desired unnatural polymers: Many of the methods developed for the selection of biological molecules can be applied to selections for evolved PNAs with desired properties. Like nucleic acid or phage-display selections, libraries of unnatural polymers generated by the DNA-templated polymerization methods described above are self-tagged and therefore do not need to be spatially separated or synthesized on pins or beads. Selection for a Ni²⁺-binding PNA may be done simply by passing the entire library resulting from translation of random oligonucleotides through commercially available Ni-NTA (“His-Tag”) resin precharged with nickel. Desired molecules bind to the resin and are eluted with EDTA. Sequencing these PNAs after several cycles of selection, mutagenesis, and amplification reveals which of the initially chosen side chains can assemble together to form a Ni²⁺ receptor. In addition, the isolation of a PNA Ni²⁺ chelator represents the PNA equivalent of a histidine tag, which may prove useful for the purification of subsequent unnatural polymers. Later efforts will involve more ambitious selections. For example, PNAs that fluoresce in the presence of specific ligands may be selected by FACS sorting of translated polymers linked through their DNA templates to beads. Those beads that fluoresce in the presence, but not in the absence, of the target ligand are isolated and characterized. Finally, the use of a biotinylated “stop” monomer as described above allows for the direct selection for the catalysis of many bond-forming or bond-breaking reactions. Two examples depicted in FIG. 28 outline a selection for a functionalized PNA that catalyzes the retroaldol cleavage of fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate, an essential step in glycolysis, as well as a selection for PNA that catalyzes the converse aldol reaction.

C) Evolvable Libraries of Small Molecules: In yet another embodiment of the present invention, the inventive methods are used in preparing amplifiable and evolvable unnatural nonpolymeric molecules including synthetic drug scaffolds. Nucleophilic or electrophilic groups are individually unmasked on a small molecule scaffold attached by simple covalent linkage or through a common solid support to an encoding oligonucleotide template. Electrophilic or nucleophilic reactants linked to short nucleic acid sequences are hybridized to the corresponding templates. Sequence-specific reaction with the appropriate reagent takes place by proximity catalysis (FIG. 29).

Following synthetic functionalization of all positions in a manner determined by the sequence of the attached DNA (FIGS. 30 & 31), the resulting encoded beads may be subjected to a wide range of biological screens which have been developed for assaying compounds on resin (Gordon et al. J. Med. Chem. 37:1385, 1994; Gallop et al. J. Med. Chem. 37:1233-1251, 1994; each of which is incorporated herein by reference).

Encoding DNA is cleaved from each bead identified in the screen and subjected to PCR, mutagenesis, sequencing, or homologous recombination before reattachment to a solid support. Ultimately, this system is most flexible when the encoding DNA is directly linked to the combinatorial synthetic scaffold without an intervening bead. In this case, entire libraries of compounds may be screened or selected for desired activities, their encoding DNA liberated, amplified, mutated, and recombined, and new compounds synthesized all in a small series of one-pot, massively parallel reactions. Without a bead support, however, reactivities of hybridized reactants must be highly efficient since only one template molecule directs the synthesis of the entire small molecule.

The development of evolvable synthetic small molecule libraries relies on chemical catalysis provided by the proximity of DNA-hybridized reactants. It will be appreciated that acceptable distances between hybridized reactants and unmasked reactive groups must first be defined for efficient DNA-templated functionalization. These distances are defined by hybridizing radiolabeled electrophiles (activated esters in our first attempts) attached to short oligonucleotides at varying distances from a reactive nucleophile (a primary amine) on a strand of DNA. At given time points, aliquots are subjected to gel electrophoresis and autoradiography to monitor the course of the reaction. Plotting the extent of reaction as a function of the distance (in bases) between the nucleophile and electrophile will define an acceptable distance window within which proximity-based catalysis of a DNA-hybridized reaction can take place. The width of this window will determine the number of distinct reactions that can be encoded on a strand of DNA (a larger window allows more reactions) as well as the nature of the codons (a larger window is required for longer codons) (FIG. 32).
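The window determination described above can be sketched in a few lines of Python. This is only an illustrative sketch: the yield values, the yield threshold, and the function names `distance_window` and `max_codons` are hypothetical and are not taken from the experiments. The sketch also assumes, as a simplification, that codons are non-overlapping and must fit entirely within the window.

```python
def distance_window(yields_by_distance, min_yield):
    """Return (lo, hi), the longest contiguous run of distances (in bases)
    whose measured yield is at least min_yield, or None if no distance qualifies.

    yields_by_distance maps an integer distance to a fractional yield (0..1).
    """
    best = None
    run_start = None
    prev = None
    for d in sorted(yields_by_distance):
        ok = yields_by_distance[d] >= min_yield
        contiguous = prev is not None and d == prev + 1
        if ok:
            # Start a new run after a failing distance or a gap in the data.
            if run_start is None or not contiguous:
                run_start = d
            if best is None or d - run_start > best[1] - best[0]:
                best = (run_start, d)
        else:
            run_start = None
        prev = d
    return best


def max_codons(window, codon_length):
    """Number of non-overlapping codons of codon_length bases that fit in the window."""
    lo, hi = window
    return (hi - lo + 1) // codon_length


# Hypothetical yields at distances of 0 to 4 bases (illustrative numbers only).
example = {0: 0.8, 1: 0.7, 2: 0.6, 3: 0.2, 4: 0.5}
window = distance_window(example, min_yield=0.5)
```

Under these assumptions, a window spanning distances 0 through 29 would accommodate three 10-base codons, whereas a narrower window would force either fewer encoded reactions or shorter codons.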

Once acceptable distances between functional groups on a combinatorial synthetic scaffold and hybridized reactants are determined, the codon format is determined. The nonpolymeric nature of small molecule synthesis simplifies codon reading, as frameshifting is not an issue and relatively large codons may be used to ensure that each set of reactants hybridizes only to one region of the encoding DNA strand.
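One simple way to check that a proposed codon set keeps each reactant confined to a single region is to require a minimum number of mismatches between every pair of codons. The following Python sketch illustrates the idea; the codon sequences, the mismatch threshold, and the function names are hypothetical, and mismatch counting is only a crude stand-in for real hybridization thermodynamics.

```python
from itertools import combinations


def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cross_hybridization_risks(codons, min_mismatches):
    """Return codon pairs that differ at fewer than min_mismatches positions."""
    return [(a, b) for a, b in combinations(codons, 2)
            if hamming(a, b) < min_mismatches]


# Hypothetical 10-base codons; the first two differ at a single position.
codons = ["AACCGGTTAC", "AACCGGTTAG", "TTGGCCAATG"]
risky = cross_hybridization_risks(codons, min_mismatches=3)
```

Pairs flagged in this way would be candidates for redesign or removal before the codon format is fixed.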

Once the distance of the linker between the functional group and synthetic small molecule scaffold and the codon format have been determined, one can synthesize small molecules based on a small molecule scaffold such as the cephalosporin scaffold shown in FIG. 31. The primary amine of 7-aminocephalosporanic acid is first protected using FMOC-Cl, and then the acetyl group is hydrolyzed by treatment with base. The encoding DNA template is then attached through an amide linkage using EDC and HOBt to the carboxylic acid group. A transfer molecule with an anti-codon capable of hybridizing to the attached DNA template is then allowed to hybridize to the template. The transfer molecule has associated with it through a disulfide linkage a primary amine bearing R₁. Activation of the primary hydroxyl group on the cephalosporin scaffold with DSC following treatment with TCEP affords the amine covalently attached to the scaffold through a carbamate linkage. Further treatment with another transfer unit having an amino acid leads to functionalization of the deprotected primary amine of the cephalosporin scaffold. Cephalosporin-like molecules synthesized in this manner may then be selected, amplified, and/or evolved using the inventive methods and compositions. The DNA template may be diversified and evolved using DNA shuffling.

D) Multi-Step Small Molecule Synthesis Programmed by DNA Templates: Molecular evolution requires the sequence-specific translation of an amplifiable information carrier into the structures of the evolving molecules. This requirement has limited the types of molecules that have been directly evolved to two classes, proteins and nucleic acids, because only these classes of molecules can be translated from nucleic acid sequences. As described generally above, a promising approach to the evolution of molecules other than proteins and nucleic acids uses DNA-templated synthesis as a method of translating DNA sequences into synthetic small molecules. DNA-templated synthesis can direct a wide variety of powerful chemical reactions with high sequence-specificity and without requiring structural mimicry of the DNA backbone. The application of this approach to synthetic molecules of useful complexity, however, requires the development of general methods to enable the product of a DNA-templated reaction to undergo subsequent DNA-templated transformations. The first DNA-templated multi-step small molecule syntheses are described in detail herein. Together with recent advances in the reaction scope of DNA-templated synthesis, these findings set the stage for the in vitro evolution of synthetic small molecule libraries.

Multi-step DNA-templated small molecule synthesis faces two major challenges beyond those associated with DNA-templated synthesis in general. First, the DNA used to direct reagents to appropriate templates must be removed from the product of a DNA-templated reaction prior to subsequent DNA-templated synthetic steps in order to prevent undesired hybridization to the template. Second, multi-step synthesis often requires the purification and isolation of intermediate products, yet common methods used to purify and isolate reaction products are not appropriate for multi-step synthesis on the molecular biology scale. To address these challenges, three distinct strategies, of the kind implemented in solid-phase organic synthesis, were developed for linking chemical reagents with their decoding DNA oligonucleotides, and two general approaches were developed for product purification after any DNA-templated synthetic step.

When possible, an ideal reagent-oligonucleotide linker for DNA-templated synthesis positions the oligonucleotide as a leaving group of the reagent. Under this “autocleaving” linker strategy, the oligonucleotide-reagent bond is cleaved as a natural chemical consequence of the reaction (FIG. 33). As the first example of this approach applied to DNA-templated chemistry, a dansylated Wittig phosphorane reagent (1) was synthesized in which the decoding DNA oligonucleotide was attached to one of the aryl phosphine groups (I. Hughes, Tetrahedron Lett. 1996, 37, 7595). DNA-templated Wittig olefination (as described above) with aldehyde-linked template 2 resulted in the efficient transfer of the fluorescent dansyl group from the reagent to the template to provide olefin 3 (FIG. 33). As a second example of an autocleaving linker, DNA-linked thioester 4, when activated with Ag(I) at pH 7.0 (Zhang et al. J. Am. Chem. Soc. 1999, 121, 3311), acylated amino-terminated template 5 to afford amide product 6 (FIG. 33). Ribosomal protein biosynthesis uses aminoacylated tRNAs in a similar autocleaving linker format to mediate RNA-templated peptide bond formation. To purify desired products away from unreacted reagents and from cleaved oligonucleotides following DNA-templated reactions using autocleaving linkers, biotinylated reagent oligonucleotides were utilized and crude reactions were washed with streptavidin-linked magnetic beads (FIG. 34). Although this approach does not separate reacted templates from unreacted templates, unreacted templates can be removed in subsequent DNA-templated reaction and purification steps (see below).

Reagents bearing more than one functional group can be linked to their decoding DNA oligonucleotides through second and third linker strategies. In the “scarless linker” approach, one functional group of the reagent is reserved for DNA-templated bond formation, while the second functional group is used to attach a linker that can be cleaved without introducing additional unwanted chemical functionality. DNA-templated reaction is followed by cleavage of the linker attached through the second functional group to afford desired products (FIG. 33). For example, a series of aminoacylation reagents such as (D)-Phe derivative 7 were synthesized in which the α-amine is connected through a carbamoylethylsulfone linker (Zarling et al. J. Immunology 1980, 124, 913) to its decoding DNA oligonucleotide. The product (8) of DNA-templated amide bond formation (as described herein) using this reagent and an amine-terminated template (5) was treated with aqueous base to effect the quantitative elimination and spontaneous decarboxylation of the linker, affording product 9 containing the cleanly transferred amino acid group (FIG. 33). This sulfone linker is stable in pH 7.5 or lower buffer at 25° C. for more than 24 h yet undergoes quantitative cleavage when exposed to pH 11.8 buffer for 2 h at 37° C.

In some cases it may be advantageous to introduce new chemical groups as a consequence of linker cleavage. Under a third linker strategy, linker cleavage generates a “useful scar” that can be functionalized in subsequent steps. As an example of this class of linker, amino acid reagents such as the (L)-Phe derivative 10 were generated, linked through 1,2-diols (Fruchart et al. Tetrahedron Lett. 1999, 40, 6225) to their decoding DNA oligonucleotides. Following DNA-templated amide bond formation with amine-terminated template (5), this linker was quantitatively cleaved by oxidation with 50 mM aqueous NaIO₄ at pH 5.0 to afford product 12 containing an aldehyde group appropriate for subsequent functionalization (for example, in a DNA-templated Wittig olefination, reductive amination, or nitroaldol addition) (FIG. 33).

Desired products generated from DNA-templated reactions using the scarless or useful scar linkers can be readily purified using biotinylated reagent oligonucleotides (FIG. 34). Reagent oligonucleotides together with desired products are first captured on streptavidin-linked magnetic beads. Any unreacted template bound to reagent by base pairing is removed by washing the beads with buffer containing 4 M guanidinium chloride. Biotinylated molecules remain bound to the streptavidin beads under these conditions. Desired product is then isolated in pure form by eluting the beads with linker cleavage buffer (in the examples above, either pH 11 or NaIO₄-containing buffer), while reacted and unreacted reagents remain bound to the beads.

Integrating the recently expanded repertoire of synthetic reactions compatible with DNA-templated synthesis and the linker strategies described above, multi-step DNA-templated small molecule syntheses can be conducted.

In one embodiment, a solution phase DNA-templated synthesis of a non-natural peptide library is described generally below and is shown generally in FIG. 35. As shown in FIG. 35, to generate the initial template pool for the library, thirty synthetic biotinylated 5′-amino oligonucleotides of the sequence format shown in FIG. 35 are each acylated with one of thirty different natural or unnatural amino acids using standard EDC coupling procedures. Four bases representing a “codon” within each aminoacylated primer designate the identity of the side chain (R₁). The “genetic code” for these side chains is chosen so that related codons encode chemically similar side chains, and the side chains are protected with acid-labile protecting groups similar to those commonly used in peptide synthesis. The resulting thirty aminoacylated DNA primers are annealed to a template DNA oligonucleotide library generated by automated DNA synthesis. Primer extension with a DNA polymerase followed by strand denaturation and purification with streptavidin-linked magnetic beads yields the starting template library (see FIG. 35). As but one general example, a solution phase DNA-templated synthesis of a non-natural peptide library is depicted in FIG. 36. The template library is subjected to three DNA-templated peptide bond formation reactions using amino acid reagents attached to 10-base decoding DNA oligonucleotides through the sulfone linker described above. Products of each step are purified by preparative denaturing polyacrylamide gel electrophoresis prior to linker cleavage if desired, although this may not be necessary. Each DNA-linked reagent can be synthesized by coupling a 3′-amino-terminated DNA oligonucleotide to the encoded amino acid through the bis-NHS carbonate derivative of the sulfone linker as shown in FIG. 37. Codons are again chosen so that related codons encode chemically similar amino acids. Following each peptide bond formation reaction, acetic anhydride is used to cap unreacted starting materials and pH 11 buffer is used to effect linker cleavage to expose a new amino group for the next peptide bond formation reaction. Once the tetrapeptide is completed, those library members bearing carboxylate side chains can also be cyclized with their amino termini to form macrocyclic peptides, while linear peptide members can simply be N-acetylated (see FIG. 36).

It will be appreciated that a virtually unlimited assortment of amino acid building blocks can be incorporated into a non-natural peptide library. Unlike peptide libraries generated using the protein biosynthetic machinery such as phage displayed libraries (O'Neil et al. Curr. Opin. Struct. Biol. 1995, 5, 443-9), mRNA displayed libraries (Roberts et al. Proc. Natl. Acad. Sci. USA 1997, 94, 12297-12302), ribosome displayed libraries (Roberts et al. Curr. Opin. Chem. Biol. 1999, 3, 268-73; Schaffitzel et al. J. Immunol. Methods 1999, 231, 119-35), or intracellular peptide libraries (Norman et al. Science 1999, 285, 591-5), amino acids with non-proteinogenic side chains, non-natural side chain stereochemistry, or non-peptidic backbones can all be incorporated into this library. In addition, the many commercially available di-, tri- and oligopeptides can also be used as building blocks to generate longer library members. The presence of non-natural peptides in this library may confer enhanced pharmacological properties such as protease resistance compared with peptides generated ribosomally. Similarly, the macrocyclic library members may yield higher affinity ligands since the entropy loss upon binding their targets may be less than that of their more flexible linear counterparts. Based on the enormous variety of commercially available amino acids fitting these descriptions, the maximum diversity of this non-natural cyclic and linear tetrapeptide library can exceed 100×100×100×100=10⁸ members.
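The diversity arithmetic above, and the related question of how long a codon must be to encode a given number of building blocks, can be checked with a short Python sketch. The function names are hypothetical, and the calculation assumes one codon per position drawn from the four DNA bases, with every codon sequence usable (in practice, codons prone to mismatched annealing would be excluded).

```python
from math import prod


def library_size(building_blocks_per_position):
    """Theoretical number of library members: product of the choices at each position."""
    return prod(building_blocks_per_position)


def min_codon_length(n_building_blocks, alphabet_size=4):
    """Shortest codon (in bases) offering at least n_building_blocks distinct sequences."""
    if n_building_blocks < 1:
        raise ValueError("need at least one building block")
    length = 1
    while alphabet_size ** length < n_building_blocks:
        length += 1
    return length


tetrapeptide_diversity = library_size([100, 100, 100, 100])  # 100**4
codon_bases_for_100 = min_codon_length(100)  # 4**3 = 64 is too few; 4**4 = 256 suffices
```

On this simple count, one hundred building blocks per position require at least a four-base codon, which leaves room to discard poorly behaved codons.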

Another example of a library using the approach described above includes the DNA-templated synthesis of a diversity-oriented macrobicyclic library containing 5- and 14-membered rings (FIG. 38). Starting material for this library consists of DNA templates aminoacylated with a variety of side-chain protected lysine derivatives and commercially available lysine analogs (including aminoethylcysteine, aminoethylserine, and 4-hydroxylysine). In the first step, DNA-templated amide bond formation with a variety of DNA-linked amino acids takes place as described in the non-natural peptide library, except that the vicinal diol linker 16 described above is used instead of the traceless sulfone linker. Following amide bond formation, the diol linker is oxidatively cleaved with sodium periodate. Deprotection of the lysine analog side chain amine is followed by DNA-templated amide bond formation catalyzed by silver trifluoroacetate between the free amine and a library of acrylic-derived thioesters. The resulting olefins are treated with a hydroxylamine to form nitrones, which undergo 1,3-dipolar cycloaddition to yield the bicyclic library (FIG. 38). DNA-linked reagents for this library are prepared by coupling lysine analogs to 5′-amino-terminated template primers (FIG. 35), coupling aminoacylated diol linkers to 3′-amino-terminated DNA oligonucleotides (FIG. 38), and coupling acrylic acids to 3′-thiol-terminated DNA oligonucleotides (FIG. 38).

As but one example of a specific library generated from the first general approach described above, three iterated cycles of DNA-templated amide formation, traceless linker cleavage, and purification with streptavidin-linked beads were used to generate a non-natural tripeptide (FIG. 39). Each amino acid reagent was linked to a unique biotinylated 10-base DNA oligonucleotide through the sulfone linker described above. The 30-base amine-terminated template programmed to direct the tripeptide synthesis contained three consecutive 10-base regions that were complementary to the three reagents, mimicking the strategy that would be used in a multi-step DNA-templated small molecule library synthesis. The first amino acid reagent (13) was combined with the template and activator 4-(4,6-dimethoxy-1,3,5-triazin-2-yl)-4-methylmorpholinium chloride (DMT-MM) (Kunishima et al. Tetrahedron 2001, 57, 1551) to effect DNA-templated peptide bond formation. The desired product was purified by mixing the crude reaction with streptavidin-linked magnetic beads, washing with 4 M guanidinium chloride, and eluting with pH 11 buffer to effect sulfone linker cleavage, providing product 14. The free amine group in 14 was then elaborated in a second and third round of DNA-templated amide formation and linker cleavage to afford dipeptide 15 and tripeptide 16 (FIG. 39).
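The way a 30-base template with three consecutive 10-base regions addresses three different reagent oligonucleotides can be illustrated with a small Python sketch. The template sequence below is invented for illustration and is not the sequence used in the experiments; the function names are likewise hypothetical. The sketch considers only perfect Watson-Crick complementarity and ignores partial matches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealing_sites(template, reagent_oligo):
    """Return 0-based start positions on the template (5'->3') at which the
    reagent oligonucleotide is perfectly complementary."""
    target = reverse_complement(reagent_oligo)
    last_start = len(template) - len(target)
    return [i for i in range(last_start + 1) if template.startswith(target, i)]


# Hypothetical 30-base template built from three 10-base coding regions.
regions = ["ACGTTGCAAG", "GGATCCTTAA", "CTGACTGACC"]
template = "".join(regions)

# Each hypothetical reagent oligonucleotide is the reverse complement of one region.
reagents = [reverse_complement(r) for r in regions]
sites = [annealing_sites(template, oligo) for oligo in reagents]
```

In this toy template each reagent finds exactly one site, at offsets 0, 10, and 20, which mirrors the requirement that each reagent anneal at a unique location.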

The progress of each reaction, purification, and sulfone linker cleavage step was followed by denaturing polyacrylamide gel electrophoresis. The final tripeptide linked to template (16) was digested with the restriction endonuclease EcoRI and the digestion fragment containing the tripeptide was characterized by MALDI mass spectrometry. Beginning with 2 nmol (˜20 μg) of starting material, sufficient tripeptide product was generated to serve as the template for more than 10⁶ in vitro selections and PCR reactions (Kramer et al. in Current Protocols in Molecular Biology, Vol 3 (Ed.: F. M. Ausubel), Wiley, 1999, pp. 15.1) (assuming 1/10,000 molecules survive selection). No significant product was generated when the starting material template was capped with acetic anhydride, or when control reagents containing sequence mismatches were used instead of the complementary reagents (FIG. 39).

A non-peptidic multi-step DNA-templated small molecule synthesis (FIG. 40) that uses all three linker strategies developed above was also performed. An amine-terminated 30-base template was subjected to DNA-templated amide bond formation using an aminoacyl donor reagent (17) containing the diol linker and a biotinylated 10-base oligonucleotide to afford amide 18. The desired product was isolated by capturing the crude reaction on streptavidin beads followed by cleaving the linker with NaIO₄ to generate aldehyde 19. The DNA-templated Wittig reaction of 19 with the biotinylated autocleaving phosphorane reagent 20 afforded fumaramide 21. The products from the second DNA-templated reaction were partially purified by washing with streptavidin beads to remove reacted and unreacted reagent. In the third DNA-templated step, fumaramide 21 was subjected to a DNA-templated conjugate addition (Gartner et al. J. Am. Chem. Soc. 2001, 123, 6961) using thiol reagent 22 linked through the sulfone linker to a biotinylated oligonucleotide. The desired conjugate addition product (23) was purified by immobilization with streptavidin beads. Linker cleavage with pH 11 buffer afforded final product 24 in 5-10% overall isolated yield for the three bond forming reactions, two linker cleavage steps, and three purifications (FIG. 40). This final product was digested with EcoRI and the mass of the small molecule-linked template fragment was confirmed by MALDI mass spectrometry (exact mass: 2568, observed mass: 2566±5). As in the tripeptide example, each of the three reagents used during this multi-step synthesis annealed at a unique location on the DNA template, and control reactions with sequence mismatches yielded no product (FIG. 40). As expected, control reactions in which the Wittig reagent was omitted (step 2) also did not generate product following the third step. Taken together, the DNA-templated syntheses of 16 and 24 demonstrate the ability of DNA to direct the sequence-programmed multi-step synthesis of both oligomeric and non-oligomeric small molecules unrelated in structure to nucleic acids.
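To put the 5-10% overall isolated yield in perspective, the following Python sketch estimates the average yield per operation, assuming (purely for illustration) that the eight operations listed above (three bond-forming reactions, two linker cleavages, and three purifications) contribute equally. It also checks that the observed mass agrees with the exact mass within the stated uncertainty. The function names are hypothetical.

```python
def average_step_yield(overall_yield, n_steps):
    """Geometric-mean yield per step that would give the stated overall yield."""
    return overall_yield ** (1.0 / n_steps)


def within_tolerance(observed, expected, tolerance):
    """True if the observed value lies within +/- tolerance of the expected value."""
    return abs(observed - expected) <= tolerance


n_operations = 3 + 2 + 3  # bond formations + linker cleavages + purifications
low = average_step_yield(0.05, n_operations)   # roughly 0.69 per operation
high = average_step_yield(0.10, n_operations)  # roughly 0.75 per operation
mass_ok = within_tolerance(observed=2566, expected=2568, tolerance=5)
```

Under this equal-contribution assumption, each operation would need to proceed in roughly 69-75% yield, and the observed mass of 2566 is consistent with the exact mass of 2568 within the ±5 uncertainty.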

The commercial availability of many substrates for DNA-templated reactions including amines, carboxylic acids, α-halo carbonyl compounds, olefins, alkoxyamines, aldehydes, and nitroalkanes may allow the translation of large libraries of DNA into diverse small molecule libraries. The direct one-pot selection of these libraries for members with desired binding or catalytic activities, followed by the PCR amplification and diversification of the DNA encoding active molecules, may enable synthetic small molecules to evolve in a manner paralleling the powerful methods Nature uses to generate new molecular function. In addition, multi-step nucleic acid-templated synthesis is a requirement of previously proposed models (A. I. Scott, Tetrahedron Lett. 1997, 38, 4961; Li et al. Nature 1994, 369, 218; Tamura et al. Proc. Natl. Acad. Sci. USA 2001, 98, 1393) for the prebiotic translation of replicable information into functional molecules. These findings demonstrate that nucleic acid templates are indeed capable of directing iterative or non-iterative multi-step small molecule synthesis even when reagents anneal at widely varying distances from the growing molecule (in the above examples, zero to twenty bases). As described in more detail below, libraries of synthetic molecules can then be evolved towards active ligands and catalysts through cycles of translation, selection, amplification and mutagenesis.

E) Evolving Plastics: In yet another embodiment of the present invention, a nucleic acid (e.g., DNA, RNA, derivative thereof) is attached to a polymerization catalyst. Since nucleic acids can fold into complex structures, the nucleic acid can be used to direct and/or affect the polymerization of a growing polymer chain. For example, the nucleic acid may influence the selection of monomer units to be polymerized as well as how the polymerization reaction takes place (e.g., stereochemistry, tacticity, activity). The synthesized polymers may be selected for specific properties such as molecular weight, density, hydrophobicity, tacticity, stereoselectivity, etc., and the nucleic acid which formed an integral part of the catalyst which directed its synthesis may be amplified and evolved (FIG. 41). Iterated cycles of ligand diversification, selection, and amplification allow for the true evolution of catalysts and polymers towards desired properties.

To give but one example, a library of DNA molecules is attached to Grubbs' ruthenium-based ring opening metathesis polymerization (ROMP) catalyst through a dihydroimidazole ligand (Scholl et al. Org. Lett. 1(6):953, 1999; incorporated herein by reference), creating a large, diverse pool of potential catalytic molecules, each unique by nature of the functionalized ligand. Undoubtedly, functionalizing the catalyst with a relatively large DNA-dihydroimidazole (DNA-DHI) ligand will alter the activity of the catalyst. Each DNA molecule has the potential to fold into a unique stereoelectronic shape which potentially has different selectivities and/or activities in the polymerization reaction (FIG. 42). Therefore, the library of DNA ligands can be “translated” into a library of plastics upon the addition of various monomers. In certain embodiments, DNA-DHI ligands capable of covalently inserting themselves into the growing polymer, thus creating a polymer tagged with the DNA that encoded its creation, are used. Using the synthetic scheme shown in FIG. 42, DHI ligands are produced containing two chemical handles, one used to attach the DNA to the ligand, the other used to attach a pendant olefin to the DHI backbone. Rates of metathesis are known to vary widely based upon olefin substitution as well as the identity of the catalyst. Through alteration of these variables, the rate of pendant olefin incorporation can be modulated such that k_(pendant olefin metathesis) << k_(ROMP), thereby allowing polymers of moderate to high molecular weights to be formed before insertion of the DNA tag and corresponding polymer termination. Vinylic ethers are commonly used in ROMP to functionalize the polymer termini (Gordon et al. Chem. Biol. 7:9-16, 2000; incorporated herein by reference), as well as to produce polymers of decreased molecular weight.

A polymer is subsequently selected from the library based on a desired property by electrophoresis, gel filtration, centrifugal sedimentation, partitioning into solvents of different hydrophobicities, etc. Amplification and diversification of the coding nucleic acid via techniques such as error-prone PCR or DNA shuffling, followed by attachment to a DHI backbone, will allow for production of another pool of potential ROMP catalysts enriched in the selected activity (FIG. 43). This method provides a new approach to generating polymeric materials and the catalysts that create them.

Example 6

Characterization of DNA-Templated Synthetic Small Molecule Libraries: The non-natural peptide and bicyclic libraries described above are characterized in several stages. Each candidate reagent is conjugated to its decoding DNA oligonucleotide, then subjected to model reactions with matched and mismatched templates. The products from these reactions are analyzed by denaturing polyacrylamide gel electrophoresis to assess reaction efficiency, and by mass spectrometry to verify anticipated product structures. Once a complete set of robust reagents is identified, the complete multi-step DNA-templated syntheses of representative single library members are performed on a large scale and the final products are characterized by mass spectrometry.

More specifically, the sequence fidelity of each multi-step DNA-templated library synthesis is tested by following the fate of single chemically labeled reagents through the course of one-pot library synthesis reactions. For example, products arising from building blocks bearing a ketone group are captured with commercially available hydrazide-linked resin and analyzed by DNA sequencing to verify sequence fidelity during DNA-templated synthesis. Similarly, when using non-biotinylated model templates, building blocks bearing biotin groups are purified after DNA-templated synthesis using streptavidin magnetic beads and subjected to DNA sequencing (Liu et al. J. Am. Chem. Soc. 2001, 123, 6961-6963). Codons that show a greater propensity to anneal with mismatched DNA are identified by screening in this manner and removed from the genetic code of these synthetic libraries.

Example 7

In Vitro Selection of Protein Ligands from Evolvable Synthetic Libraries: Because every library member generated in this approach is covalently linked to a DNA oligonucleotide that encodes and directs its synthesis, libraries can be subjected to true in vitro selections. Although direct selections for small molecule catalysts of bond-forming or bond-cleaving reactions are an exciting potential application of this approach, the simplest in vitro selection that can be used to evolve these libraries is a selection for binding to a target protein. An ideal initial target protein for the synthetic library selection both plays an important biological role and possesses known ligands of varying affinities for validating the selection methods.

One receptor of special interest for use in the present invention is the α_(V)β₃ receptor. The α_(V)β₃ receptor is a member of the integrin family of transmembrane heterodimeric glycoprotein receptors (Miller et al. Drug Discov. Today 2000, 5, 397-408; Berman et al. Membr. Cell Biol. 2000, 13, 207-44). The α_(V)β₃ integrin receptor is expressed on the surface of many cell types such as osteoclasts, vascular smooth muscle cells, endothelial cells, and some tumor cells. This receptor mediates several important biological processes including adhesion of osteoclasts to the bone matrix (van der Pluijm et al. J. Bone Miner. Res. 1994, 9, 1021-8), smooth muscle cell migration (Choi et al. J. Vasc. Surg. 1994, 19, 125-34), and tumor-induced angiogenesis (the outgrowth of new blood vessels) (Brooks et al. Cell 1994, 79, 1157-64). During tumor-induced angiogenesis, invasive endothelial cells bind to extracellular matrix components through their α_(V)β₃ integrin receptors. Several studies (Brooks et al. Cell 1994, 79, 1157-64; Brooks et al. Cell 1998, 92, 391-400; Friedlander et al. Science 1995, 270, 1500-2; Varner et al. Cell Adhes. Commun. 1995, 3, 367-74; Brooks et al. J. Clin. Invest. 1995, 96, 1815-22) have demonstrated that the inhibition of this integrin binding event with antibodies or small synthetic peptides induces apoptosis of the proliferative angiogenic vascular cells and can inhibit tumor metastasis.

A number of peptide ligands of varying affinities and selectivities for the α_(V)β₃ integrin receptor have been reported. Two benchmark α_(V)β₃ integrin antagonists are the linear peptide GRGDSPK (IC₅₀=210 nM; Dechantsreiter et al. J. Med. Chem. 1999, 42, 3033-40; Pfaff et al. J. Biol. Chem. 1994, 269, 20233-8) and the cyclic peptide cyclo-RGDfV (f=(D)-Phe, IC₅₀=10 nM; Pfaff et al. J. Biol. Chem. 1994, 269, 20233-8). While peptide antagonists for integrins commonly contain RGD, not all RGD-containing peptides are high affinity integrin ligands. Rather, the conformational context of RGD and other peptide sequences can have a profound effect on integrin affinity and specificity (Wermuth et al. J. Am. Chem. Soc. 1997, 119, 1328-1335; Geyer et al. J. Am. Chem. Soc. 1994, 116, 7735-7743; Rai et al. Bioorg. Med. Chem. Lett. 2001, 11, 1797-800; Rai et al. Curr. Med. Chem. 2001, 8, 101-19). For this reason, combinatorial approaches towards α_(V)β₃ integrin receptor antagonist discovery are especially promising.

The biologically important and medicinally relevant role of the α_(V)β₃ integrin receptor together with its known peptide antagonists and its commercial availability (Chemicon International, Inc., Temecula, Calif.) make the α_(V)β₃ integrin receptor an ideal initial target for DNA-templated synthetic small molecule libraries. The α_(V)β₃ integrin receptor can be immobilized by adsorption onto microtiter plate wells without impairing its ligand binding ability or specificity (Dechantsreiter et al. J. Med. Chem. 1999, 42, 3033-40; Wermuth et al. J. Am. Chem. Soc. 1997, 119, 1328-1335; Haubner et al. J. Am. Chem. Soc. 1996, 118, 7461-7472). Alternatively, the receptor can be immobilized by conjugation with NHS ester or maleimide groups covalently linked to sepharose beads and the ability of the resulting integrin affinity resin to maintain known ligand binding properties can be verified.

To perform the actual protein binding selections, DNA template-linked synthetic peptide or macrocyclic libraries are dissolved in aqueous binding buffer in one pot and equilibrated in the presence of immobilized α_(V)β₃ integrin receptor. Non-binders are washed away with buffer. Those molecules that may be binding through their attached DNA templates rather than through their synthetic moieties are eliminated by washing the bound library with unfunctionalized DNA templates lacking PCR primer binding sites. Remaining ligands bound to the immobilized α_(V)β₃ integrin receptor are eluted by denaturation or by the addition of excess high affinity RGD-containing peptide ligand. The DNA templates that encode and direct the syntheses of α_(V)β₃ integrin binders are amplified by PCR using one primer designed to bind to a constant 3′ region of the template and one pool of biotinylated primers functionalized at its 5′ end with the library starting materials (FIG. 44). Purification of the biotinylated strand completes one cycle of synthetic molecule translation, selection, and amplification, yielding a sub-population of DNA templates enriched in sequences that encode synthetic α_(V)β₃ integrin ligands.

For reasons similar to those that make the α_(V)β₃ integrin receptor an attractive initial target for the approach to generating synthetic molecules with desired properties, the factor Xa serine protease also serves as a promising protein target. Blood coagulation involves a complex cascade of enzyme-catalyzed reactions that ultimately generate fibrin, the basis of blood clots (Rai et al. Curr. Med. Chem. 2001, 8, 101-109; Vacca et al. Curr. Opin. Chem. Biol. 2000, 4, 394-400). Thrombin is the serine protease that converts fibrinogen into fibrin during blood clotting. Thrombin, in turn, is generated by the proteolytic action of factor Xa on prothrombin. Because thromboembolic (blood clotting) diseases such as stroke remain a leading cause of death in the world (Vacca et al. Curr. Opin. Chem. Biol. 2000, 4, 394-400), the development of drugs that inhibit thrombin or factor Xa is a major area of pharmaceutical research. The inhibition of factor Xa is a newer approach thought to avoid the side effects associated with inhibiting thrombin, which is also involved in normal hemostasis (Maignan et al. J. Med. Chem. 2000, 43, 3226-32; Leadley et al. J. Cardiovasc. Pharmacol. 1999, 34, 791-9; Becker et al. Bioorg. Chem. Lett. 1999, 9, 2753-8; Choi-Sledeski et al. Bioorg. Med. Chem. Lett. 1999, 9, 2539-44; Choi-Sledeski et al. J. Med. Chem. 1999, 42, 3572-87; Ewing et al. J. Med. Chem. 1999, 42, 3557-71; Bostwick et al. Thromb. Haemost. 1999, 81, 157-60). Although many agents including heparin, hirudin, and hirulog have been developed to control the production of thrombin, these agents generally have the disadvantage of requiring intravenous or subcutaneous injection several times a day in addition to possible side effects, and the search for synthetic small molecule factor Xa inhibitors remains the subject of great research effort.

Among factor Xa inhibitors with known binding affinities are a series of tripeptides ending with arginine aldehyde (Marlowe et al. Bioorg. Med. Chem. Lett. 2000, 10, 13-16) that can easily be included in the DNA-templated non-natural peptide library described above. Depending on the identities of the first two residues, these tripeptides exhibit IC₅₀ values ranging from 15 nM to 60 μM (Marlowe et al. Bioorg. Med. Chem. Lett. 2000, 10, 13-16) and therefore provide ideal positive controls for validating and calibrating an in vitro selection for synthetic factor Xa ligands (see below). Both factor Xa and active factor Xa immobilized on resin are commercially available (Protein Engineering Technologies, Denmark). The resin-bound factor Xa is used to select members of both the DNA-templated non-natural peptide and bicyclic libraries with factor Xa affinity in a manner analogous to the integrin receptor binding selections described above.

Following PCR amplification of DNA templates encoding selected synthetic molecules, additional rounds of translation, selection, and amplification are conducted to enrich the library for the highest affinity binders. The stringency of the selection is gradually increased by increasing the salt concentration of the binding and washing buffers, decreasing the duration of binding, elevating the binding and washing temperatures, and increasing the concentration of washing additives such as template DNA or unrelated proteins. Importantly, in vitro selections can also select for specificity in addition to binding affinity. To eliminate those molecules that possess undesired binding properties, library members bound to immobilized α_(V)β₃ integrin or factor Xa are washed with non-target proteins such as other integrins or other serine proteases, leaving only those molecules that bind the target protein but do not bind non-target proteins.

Iterated cycles of translation, selection, and amplification result in library enrichment rather than library evolution, which requires diversification between rounds of selection. Diversification of these synthetic libraries is achieved in at least two ways, both analogous to methods used by Nature to diversify proteins. Random point mutagenesis is performed by conducting the PCR amplification step under error-prone PCR (Caldwell et al. PCR Methods Applic. 1992, 2, 28-33) conditions. Because the genetic code of these molecules is written to assign related codons to related chemical groups, similar to the way that the natural protein genetic code is constructed, random point mutations in the templates encoding selected molecules will diversify progeny towards chemically related analogs. In addition to point mutagenesis, synthetic libraries generated in this approach are also diversified using recombination. Templates to be recombined have the structure shown in FIG. 45, in which codons are separated by five-base non-palindromic restriction endonuclease cleavage sites such as those cleaved by AvaII (G/GWCC, W=A or T), Sau96I (G/GNCC, N=A, G, T, or C), DdeI (C/TNAG), or HinfI (G/ANTC). Following selections, templates encoding desired molecules are enzymatically digested with these commercially available restriction enzymes. The digested fragments are then recombined into intact templates with T4 DNA ligase. Because the restriction sites separating codons are non-palindromic, template fragments can only reassemble to form intact recombined templates (FIG. 45). DNA-templated translation of recombined templates provides recombined small molecules. In this way, functional groups between synthetic small molecules with desired activities are recombined in a manner analogous to the recombination of amino acid residues between proteins in Nature. It is well appreciated that recombination explores the sequence space of a molecule much more efficiently than point mutagenesis alone (Minshull et al. Curr. Opin. Chem. Biol. 1999, 3, 284-90; Bogarad et al. Proc. Natl. Acad. Sci. USA 1999, 96, 2591-5; Stemmer, W. Nature 1994, 370, 389-391).
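The recognition sequences quoted above use degenerate base codes (W, N), so locating the codon-separating cleavage sites in a template amounts to a pattern match. The following is a minimal illustrative sketch only, not part of the described method: the function names and the example template are hypothetical, and only the four recognition sequences are taken from the text. It also checks whether a concrete site is its own reverse complement, which is the property that distinguishes palindromic from non-palindromic sites.

```python
import re

# IUPAC codes needed for the recognition sequences quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}

# Recognition sequences as written in the text; "/" marks the cut position.
SITES = {"AvaII": "G/GWCC", "Sau96I": "G/GNCC", "DdeI": "C/TNAG", "HinfI": "G/ANTC"}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def site_regex(site):
    """Translate a degenerate recognition sequence into a regular expression."""
    return "".join(IUPAC[base] for base in site.replace("/", ""))


def find_sites(template, enzyme):
    """Return (position, matched sequence) for every site, including overlapping ones."""
    pattern = "(?=(" + site_regex(SITES[enzyme]) + "))"
    return [(m.start(), m.group(1)) for m in re.finditer(pattern, template)]


def is_palindromic(seq):
    """True if seq equals its own reverse complement."""
    return seq == "".join(COMPLEMENT[b] for b in reversed(seq))


# Hypothetical template fragment containing two AvaII sites.
example = "AAAGGACCTTTGGTCCAAA"
hits = find_sites(example, "AvaII")
```

In this hypothetical fragment the two matched sites are GGACC and GGTCC; neither equals its own reverse complement, whereas a six-base site such as GAATTC does.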

Small molecule evolution using mutation and recombination offers two potential advantages over simple enrichment. If the total diversity of the library is much less than the number of molecules made (typically 10¹² to 10¹⁵), every possible library member is present at the start of the selection. In this case, diversification is still useful because selection conditions almost always change as rounds of evolution progress. For example, later rounds of selection will likely be conducted under higher stringencies, and may involve counter selections against binding non-target proteins. Diversification gives library members that have been discarded during earlier rounds of selection the chance to reappear in later rounds under altered selection conditions in which their fitness relative to other members may be greater. In addition, it is quite possible using this approach to generate a synthetic library that has a theoretical diversity greater than 10¹⁵ molecules. In this case, diversification allows molecules that never existed in the original library to emerge in later rounds of selections on the basis of their similarity to selected molecules, similar to the way in which protein evolution searches the vastness of protein sequence space one small subset at a time.

Example 8

Characterization of Evolved Compounds: Following multiple rounds of selection, amplification, diversification, and translation, molecules surviving the selection will be characterized for their ability to bind the target protein. To identify the DNA sequences encoding evolved synthetic molecules surviving the selection, PCR-amplified templates are cloned into vectors, transformed into cells, and sequenced as individual clones. DNA sequencing of these subcloned templates reveals the identity of the synthetic molecules surviving the selection. To gain general information about the functional groups being selected during rounds of evolution, populations of templates are sequenced in pools to reveal the distribution of A, G, T, and C at every codon position. The judicious design of each functional group's genetic code allows considerable information to be gathered from population sequencing. For example, a G at the first position of a codon may designate a charged group, while a C at this position may encode a hydrophobic substituent.
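The population-level readout described above, the distribution of A, G, T, and C at each position, can be illustrated computationally. The sketch below is a hedged illustration that assumes individual template reads of equal length are available as strings; the function name and the toy pool are hypothetical and are not taken from the described procedure.

```python
from collections import Counter


def base_distribution(templates):
    """Return, for each position, the fraction of templates carrying A, G, T, or C."""
    if not templates:
        raise ValueError("at least one template sequence is required")
    length = len(templates[0])
    if any(len(t) != length for t in templates):
        raise ValueError("all templates must have the same length")
    distribution = []
    for pos in range(length):
        counts = Counter(t[pos] for t in templates)
        distribution.append({b: counts.get(b, 0) / len(templates) for b in "AGTC"})
    return distribution


# Hypothetical pool of four three-base codons after selection.
pool = ["GAT", "GCT", "CAT", "GAA"]
dist = base_distribution(pool)
```

Under the example coding rule given above (G at the first codon position designating a charged group), a first-position G fraction that rises across rounds would suggest that charged groups are being selected at that site.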

To validate the integrin binding selection and to compare selected library members with known α_(V)β₃ integrin ligands, linear GRGDSPK and a cyclic RGDfV analog (cyclic iso-ERGDfV) are also included in the DNA-templated cyclic peptide library. The selection conditions are adjusted until verification that libraries containing these known integrin ligands undergo enrichment of the DNA templates encoding the known ligands upon selection for integrin binding. In addition, the degree of enrichment of template sequences encoding these known α_(V)β₃ integrin ligands is correlated with their known affinities and with the enrichment and affinity of newly discovered α_(V)β₃ integrin ligands.
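One simple way to express the degree of enrichment referred to above is the ratio of a template's frequency after selection to its frequency before selection. The sketch below assumes that sequenced templates before and after a round are available as lists of strings; the definition of the enrichment factor, the function name, and the toy sequences are illustrative assumptions rather than a measure specified in the text.

```python
from collections import Counter


def enrichment_factors(before, after):
    """Frequency after selection divided by frequency before, per template sequence."""
    if not before or not after:
        raise ValueError("both pools must contain at least one sequence")
    freq_before = Counter(before)
    freq_after = Counter(after)
    factors = {}
    for seq, count in freq_before.items():
        fraction_before = count / len(before)
        fraction_after = freq_after.get(seq, 0) / len(after)
        factors[seq] = fraction_after / fraction_before
    return factors


# Hypothetical pools: "AAA" stands in for a template encoding a known ligand.
pre = ["AAA", "CCC", "GGG", "TTT"]
post = ["AAA", "AAA", "AAA", "CCC"]
factors = enrichment_factors(pre, post)
```

A factor above 1 indicates enrichment, and a factor of 0 indicates that the template was not recovered after selection.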

Once the enrichment of template sequences encoding known and new integrin ligands is confirmed, novel evolved ligands will be synthesized by non-DNA templated synthesis and assayed for their α_(V)β₃ integrin receptor antagonist activity and specificity. Standard in vitro binding assays to integrin receptors (Dechantsreiter et al. J. Med. Chem. 1999, 42, 3033-40) are performed by competing the binding of biotinylated fibrinogen (a natural integrin ligand) to immobilized integrin receptor with the ligand to be assayed. The inhibition of binding to fibrinogen is quantitated by incubation with an alkaline phosphatase-conjugated anti-biotin antibody and a chromogenic alkaline phosphatase substrate. Comparison of the binding affinities of randomly chosen library members before and after selection will validate the evolution of the library towards target binding. Assays for binding non-target proteins reveal the ability of these libraries to be evolved towards binding specificity in addition to binding affinity.

Similarly, the selection for factor Xa binding is validated by including the known factor Xa tripeptide inhibitors in the library design and verifying that a round of factor Xa binding selection and PCR amplification results in the enrichment of their associated DNA templates. Synthetic library members evolved to bind factor Xa are assayed in vitro for their ability to inhibit factor Xa activity. Factor Xa inhibition can be readily assayed spectrophotometrically using the commercially available chromogenic substrate S-2765 (Chromogenix, Italy).

While the DNA sequence alone of a non-natural peptide library member is likely to reveal the exact identity of the corresponding peptide, the final step in the bicyclic library synthesis is a non-DNA-templated intramolecular 1,3-dipolar cycloaddition that may yield diastereomeric pairs of regioisomers. Although modeling strongly suggests that only the regioisomer shown in FIG. 38 can form for steric reasons, facial selectivity is less certain. Diastereomeric purity is not a requirement for the in vitro selections described above since each molecule is selected on a single molecule basis. Nevertheless, it may be useful to characterize the diastereoselectivity of the dipolar cycloaddition. To accomplish this, non-DNA-templated synthesis of selected bicyclic library members is performed, diastereomers are separated by chiral preparative HPLC, and product stereochemistry is determined by nOe or X-ray diffraction.

Example 9

Translating DNA into Non-Natural Polymers Using DNA Polymerases: An alternative approach to translating DNA into non-natural, evolvable polymers takes advantage of the ability of some DNA polymerases to accept certain modified nucleotide triphosphate substrates (D. M. Perrin et al. J. Am. Chem. Soc. 2001, 123, 1556; D. M. Perrin et al. Nucleosides Nucleotides 1999, 18, 377-91; T. Gourlain et al. Nucleic Acids Res. 2001, 29, 1898-1905; S. E. Lee et al. Nucleic Acids Res. 2001, 29, 1565-73; K. Sakthivel et al. Angew. Chem. Int. Ed. 1998, 37, 2872-2875). Several deoxyribonucleotides (FIG. 45) and ribonucleotides bearing modifications to groups that do not participate in Watson-Crick hydrogen bonding are known to be inserted with high sequence fidelity opposite natural DNA templates. Importantly, single-stranded DNA containing modified nucleotides can serve as efficient templates for the DNA-polymerase-catalyzed incorporation of natural or modified mononucleotides. In one of the earliest examples of modified nucleotide incorporation by DNA polymerase, Toole and co-workers reported the acceptance of 5-(1-pentynyl)-deoxyuridine 1 by Vent DNA polymerase under PCR conditions (J. A. Latham et al. Nucleic Acids Res. 1994, 22, 2817-22). Several additional 5-functionalized deoxyuridine derivatives (2-7) were subsequently found to be accepted by thermostable DNA polymerases suitable for PCR (K. Sakthivel et al. Angew. Chem. Int. Ed. 1998, 37, 2872-2875). The first functionalized purine accepted by DNA polymerase, deoxyadenosine analog 8, was incorporated into DNA by T7 DNA polymerase together with deoxyuridine analog 7 (D. M. Perrin et al. Nucleosides Nucleotides 1999, 18, 377-91). DNA libraries containing both 7 and 8 were successfully selected for metal-independent RNA cleaving activity (D. M. Perrin et al. J. Am. Chem. Soc. 2001, 123, 1556-63). Williams and co-workers recently tested several deoxyuridine derivatives for acceptance by Taq DNA polymerases and concluded that acceptance is greatest when using C5-modified uridines bearing rigid alkyne or trans-alkene groups such as 9 and 10 (S. E. Lee et al. Nucleic Acids Res. 2001, 29, 1565-73). A similar study (T. Gourlain et al. Nucleic Acids Res. 2001, 29, 1898-1905) on C7-functionalized 7-deaza-deoxyadenosines revealed acceptance by Taq DNA polymerase of 7-aminopropyl- (11), cis-7-aminopropenyl- (12), and 7-aminopropynyl-7-deazadeoxyadenosine (13).

The functionalized nucleotides incorporated by DNA polymerases to date, shown in FIG. 46, have focused on adding “protein-like” acidic and basic functionality to DNA. While equipping nucleic acids with general acid and general base functionality such as primary amine and carboxylate groups may increase the capability of nucleic acid catalysts, the functional groups present in natural nucleic acid bases already have demonstrated the ability to serve as general acids and bases. The hepatitis delta ribozyme, for example, is thought to use the pK_(a)-modulated endocyclic amine of cytosine 75 as a general acid (S. Nakano et al. Science 2000, 287, 1493-7) and the peptidyl transferase activity of the ribosome may similarly rely on general base or general acid catalysis (G. W. Muth et al. Science 2000, 289, 947-50; P. Nissen et al. Science 2000, 289, 920-930; N. Ban et al. Science 2000, 289, 905-920), although the latter case remains the subject of ongoing debate (N. Polacek et al. Nature 2001, 411, 498-501). Equipping DNA bases with additional Brønsted acidic and basic groups, therefore, may not profoundly expand the scope of DNA catalysis.

In contrast with simple general acid and general base functionality, chiral metal centers would expand considerably the chemical scope of nucleic acids. Functionality aimed at binding chemically potent metal centers has yet to be incorporated into nucleic acid polymers. Natural DNA has demonstrated the ability to fold in complex three-dimensional structures capable of stereospecifically binding target molecules (C. H. Lin et al. Chem. Biol. 1997, 4, 817-32; C. H. Lin et al. Chem. Biol. 1998, 5, 555-72; P. Schultze et al. J. Mol. Biol. 1994, 235, 1532-47) or catalyzing phosphodiester bond manipulation (S. W. Santoro et al. Proc. Natl. Acad. Sci. USA 1997, 94, 4262-6; R. R. Breaker et al. Chem. Biol. 1995, 2, 655-60; Y. Li et al. Biochemistry 2000, 39, 3106-14; Y. Li et al. Proc. Natl. Acad. Sci. USA 1999, 96, 2746-51), DNA depurination (T. L. Sheppard et al. Proc. Natl. Acad. Sci. USA 2000, 97, 7802-7807), and porphyrin metallation (Y. Li et al. Biochemistry 1997, 36, 5589-99; Y. Li et al. Nat. Struct. Biol. 1996, 3, 743-7). Non-natural nucleic acids augmented with the ability to bind chemically potent, water-compatible metals such as Cu, La, Ni, Pd, Rh, Ru, or Sc may possess greatly expanded catalytic properties. For example, a Pd-binding oligonucleotide folded into a well-defined structure may possess the ability to catalyze Pd-mediated coupling reactions with a high degree of regiospecificity or stereospecificity. Similarly, non-natural nucleic acids that form chiral Sc binding sites may serve as enantioselective cycloaddition or aldol addition catalysts. The ability of DNA polymerases to translate DNA sequences into these non-natural polymers coupled with in vitro selections for catalytic activities would therefore enable the direct evolution of desired catalysts from random libraries.

Evolving catalysts in this approach addresses the difficulty of rationally designing catalytic active sites with specific chemical properties that has inspired recent combinatorial approaches (K. W. Kuntz et al. Curr. Opin. Chem. Biol. 1999, 3, 313-319; M. B. Francis et al. Curr. Opin. Chem. Biol. 1998, 2, 422-8) to organometallic catalyst discovery. For example, Hoveyda and co-workers identified Ti-based enantioselective epoxidation catalysts by serial screening of peptide ligands (K. D. Shimizu et al. Angew. Chem. Int. Ed. 1997, 36). Serial screening was also used by Jacobsen and co-workers to identify peptide ligands that form enantioselective epoxidation catalysts when complexed with metal cations (M. B. Francis et al. Angew. Chem. Int. Ed. Engl. 1999, 38, 937-941). Recently, a peptide library containing phosphine side chains was screened for the ability to catalyze malonate ester addition to cyclopentenyl acetate in the presence of Pd (S. R. Gilbertson et al. J. Am. Chem. Soc. 2000, 122, 6522-6523). The current approach differs fundamentally from previous combinatorial catalyst discovery efforts, however, in that it enables catalysts with desired properties to spontaneously emerge from one-pot, solution-phase libraries after evolutionary cycles of diversification, amplification, translation, and selection. This strategy allows up to 10¹⁵ different catalysts to be generated and selected for desired properties in a single experiment. The compatibility of our approach with one-pot in vitro selections allows the direct selection for reaction catalysis rather than screening for a phenomenon associated with catalysis such as metal binding or heat generation. In addition, properties difficult to screen rapidly such as substrate stereospecificity or metal selectivity can be directly selected using our approach (see below).

Key intermediates for a number of C5-functionalized uridine analogs and C7-functionalized 7-deazaadenosine analogs have been synthesized for incorporation into non-natural DNA polymers. In addition, the synthesis of six C8-functionalized adenosine analogs as deoxyribonucleotide triphosphates has been completed. Because only limited information exists on the ability of DNA polymerases to accept modified nucleotides, analogs were chosen for synthesis that not only will bring metal-binding functionality to nucleic acids but also will provide insights into the determinants of DNA polymerase acceptance.

The strategy for the synthesis of metal-binding uridine and 7-deazaadenosine analogs is shown in FIG. 47. Both routes end with amide bond formation between NHS esters of metal-binding functional groups and amino modified deoxyribonucleotide triphosphates (7 and 13). Analogs 7 and 13 as well as acetylated derivatives of 7 have been previously shown (D. M. Perrin et al. J. Am. Chem. Soc. 2001, 123, 1556-63; D. M. Perrin et al. Nucleosides Nucleotides 1999, 18, 377-91; J. A. Latham et al. Nucleic Acids Res. 1994, 22, 2817-22; T. Gourlain et al. Nucleic Acids Res. 2001, 29, 1898-1905; S. E. Lee et al. Nucleic Acids Res. 2001, 29, 1565-73; K. Sakthivel et al. Angew. Chem. Int. Ed. Engl. 1998, 37, 2872-2875) to be tolerated by DNA polymerases, including thermostable DNA polymerases suitable for PCR. This convergent approach allows a wide variety of metal-binding ligands to be rapidly incorporated into either nucleotide analog. The synthesis of 7 has been completed following a previously reported (K. Sakthivel et al. Angew. Chem. Int. Ed. Engl. 1998, 37, 2872-2875) route (FIG. 48, Phillips, Chorba, Liu, unpublished results). Heck coupling of commercially available 5-iodo-2′-deoxyuridine (22) with N-allyltrifluoroacetamide provided 23. The 5′-triphosphate group was installed by treatment of 23 with trimethylphosphate, POCl₃, and proton sponge (1,8-bis(dimethylamino)-naphthalene) followed by tri-n-butylammonium pyrophosphate, and the trifluoroacetamide group was then removed with aqueous ammonia to afford 7.

Several steps towards the synthesis of 13, the key intermediate for 7-deazaadenosine analogs, have been completed (FIG. 49). Following a known route (J. Davoll. J. Am. Chem. Soc. 1960, 82, 131-138), diethoxyethylcyanoacetate (24) was synthesized from bromoacetal 25 and ethyl cyanoacetate (26). Condensation of 24 with thiourea provided pyrimidine 27, which was desulfurized with Raney nickel and then cyclized to pyrrolopyrimidine 28 with dilute aqueous HCl. Treatment of 28 with POCl₃ afforded 4-chloro-7-deazaadenine (29). The aryl iodide group, which will serve as a Sonogashira coupling partner for installation of the propargylic amine in 13, was installed by reacting 29 with N-iodosuccinimide to generate 4-chloro-7-iodo-7-deazaadenine (30) in 13% overall yield from bromoacetal 25.

As alternative functionalized adenine analogs that will both probe the structural requirements of DNA polymerase acceptance and provide potential metal-binding functionality, six 8-modified deoxyadenosine triphosphates (FIG. 50) have been synthesized. All functional groups were installed by addition to 8-bromo-deoxyadenosine (31), which was prepared by bromination of deoxyadenosine in the presence of ScCl₃, which we found to greatly increase product yield. Methyl- (32), ethyl- (33), and vinyladenosine (34) were synthesized by Pd-mediated Stille coupling of the corresponding alkyl tin reagent and 31 (P. Mamos et al. Tetrahedron Lett. 1992, 33, 2413-2416). Methylamino- (35) (E. Nandanan et al. J. Med. Chem. 1999, 42, 1625-1638), ethylamino- (36), and histaminoadenosine (37) were prepared by treatment of 23 with the corresponding amine in water or ethanol. The 5′-nucleotide triphosphates of 32-37 were synthesized as described above.

The ability of thermostable DNA polymerases suitable for PCR amplification to accept these modified nucleotide triphosphates containing metal-binding functionality was then tested. Non-natural nucleotide triphosphates were purified by ion exchange HPLC and added to PCR reactions containing Taq DNA polymerase, three natural deoxynucleotide triphosphates, pUC19 template DNA, and two DNA primers. Primers were chosen to generate PCR products ranging from 50 to 200 base pairs in length. Control PCR reactions contained the four natural deoxynucleotide triphosphates and no non-natural nucleotides. PCR reactions were analyzed by agarose or denaturing acrylamide gel electrophoresis. Amino modified uridine derivative 7 was efficiently incorporated by Taq DNA polymerase over 30 PCR cycles, while the triphosphate of 23 was not an efficient polymerase substrate (FIG. 51). Previous findings on the acceptance of 7 by Taq DNA polymerase are in conflict, with both non-acceptance (K. Sakthivel et al. Angew. Chem. Int. Ed. Engl. 1998, 37, 2872-2875) and acceptance (S. E. Lee et al. Nucleic Acids Res. 2001, 29, 1565-73) reported.

Non-Natural Metal-Binding Deoxyribonucleotide Triphosphate Synthesis: The syntheses of the C5-functionalized uridine, C7-functionalized 7-deazaadenosine, and C8-functionalized adenosine deoxynucleotide triphosphates will be completed. Synthesis of the 7-deazaadenosine derivatives from 4-chloro-7-iodo-7-deazaadenine (30) proceeds by glycosylation of 30 with protected deoxyribosyl chloride 38 followed by ammonolysis to afford 7-iodo-adenosine (39) (FIG. 31) (Gourlain et al. Nucleic Acids Res. 2001, 29, 1898-1905). Protected deoxyribosyl chloride 38 can be generated from deoxyribose as shown in FIG. 52. Pd-mediated Sonogashira coupling (Seela et al. Helv. Chim. Acta 1999, 82, 1878-1898) of 39 with N-propynyltrifluoroacetamide provides 40, which is then converted to the 5′ nucleotide triphosphate and deprotected with ammonia as described above to yield 13.

To generate rapidly a collection of metal-binding uridine and adenosine analogs, a variety of metal-binding groups as NHS esters will be coupled to C5-modified uridine intermediate 7 (already synthesized) and C7-modified 7-deazaadenosine intermediate 13. Metal-binding groups that will be examined initially are shown in FIG. 47 and include phosphines, thiopyridyl groups, and hemi-salen moieties. If our initial polymerase acceptance assays (see the following section) of triphosphates of 8-modified adenosines 32-37 (FIG. 50) suggest that a variety of 8-modified adenosine analogs are accepted by thermostable polymerases, alkyl- and vinyl trifluoroacetamides will be coupled to 8-bromo-deoxyadenosine (31) to generate nucleotide triphosphates such as 41 and 42 (FIG. 53). These intermediates are then coupled with the NHS esters shown in FIG. 46 to generate a variety of metal-binding 8-functionalized deoxyadenosine triphosphates.

Evaluating Non-Natural Nucleotides: Each functionalized deoxyribonucleotide triphosphate is then assayed in two stages for its suitability as a building block of an evolvable non-natural polymer library. First, simple acceptance by thermostable DNA polymerases is measured by PCR amplification of fragments of DNA plasmid pUC19 of varying length. PCR reactions contain synthetic primers designed to bind at the ends of the fragment, a small quantity of pUC19 template DNA, a thermostable DNA polymerase (Taq, Pfu or Vent), three natural deoxyribonucleotide triphosphates, and the non-natural nucleotide triphosphate to be tested. The completely successful incorporation of the non-natural nucleotide results in the production of DNA products of any length at a rate similar to that of the control reaction. Those nucleotides that allow incorporation of 10 or more non-natural nucleotides in a single product molecule with at least modest efficiency are subjected to the second stage of evaluation.

Non-natural nucleotides accepted by thermostable DNA polymerases are evaluated for their possible mutagenic properties. If DNA polymerases insert a non-natural nucleotide opposite an incorrect (non-Watson-Crick) template base, or insert an incorrect natural nucleotide opposite a non-natural nucleotide in the template, the fidelity of library amplification and translation is compromised. To evaluate this possibility, PCR products generated in the above assay are subjected to DNA sequencing using each of the PCR primers. Deviations from the sequence of the pUC19 template imply that one or both of the mutagenic mechanisms are taking place. Error rates of less than 0.7% per base per 30 PCR cycles are acceptable, as error-prone PCR generates errors at approximately this rate (Caldwell et al. PCR Methods Applic. 1992, 2, 28-33) yet has been successfully used to evolve nucleic acid libraries.
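The acceptance criterion above (less than 0.7% errors per base per 30 PCR cycles) can be illustrated with a short calculation. The sketch below assumes that sequencing reads have already been aligned to the reference without insertions or deletions, so that errors can be counted position by position; the function names and toy sequences are hypothetical, and only the 0.7% threshold comes from the text.

```python
THRESHOLD = 0.007  # 0.7% per base per 30 PCR cycles, as stated in the text


def observed_error_rate(reference, reads):
    """Mismatches per sequenced base for reads aligned to the reference without gaps."""
    if not reads:
        raise ValueError("at least one read is required")
    if any(len(r) != len(reference) for r in reads):
        raise ValueError("reads must match the reference length (no indels assumed)")
    mismatches = sum(1 for r in reads for a, b in zip(reference, r) if a != b)
    return mismatches / (len(reference) * len(reads))


def is_acceptable(rate, threshold=THRESHOLD):
    """Apply the acceptance criterion: error rate strictly below the threshold."""
    return rate < threshold


def fraction_error_free(rate, length):
    """Expected fraction of products of a given length carrying no errors."""
    return (1.0 - rate) ** length


# Hypothetical 10-base reference with two reads, one carrying a single mismatch.
ref = "ACGTACGTAC"
rate = observed_error_rate(ref, ["ACGTACGTAC", "ACGTTCGTAC"])
```

At exactly the threshold rate, roughly three quarters of 40-base products would be expected to carry no errors, assuming errors occur independently at each position.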

Pairs of promising non-natural adenosine analogs and non-natural uridine analogs are also tested together for their ability to support DNA polymerization in a PCR reaction containing both modified nucleotide triphosphates together with dGTP and dCTP. Successful PCR product formation with two non-natural nucleotide triphosphates enables the incorporation of two non-natural metal-binding bases into the same polymer molecule. Functionalized nucleotides that are especially interesting yet are not compatible with Taq, Pfu, or Vent thermostable DNA polymerases can still be used in the libraries provided that they are accepted by a commercially available DNA polymerase such as the Klenow fragment of E. coli DNA polymerase I, T7 DNA polymerase, T4 DNA polymerase, or M-MuLV reverse transcriptase. In this case, the assays require conducting the primer extension step of the PCR reaction at 25-37° C., and fresh polymerase must be added at every cycle following the 94° C. denaturation step. DNA sequencing to evaluate the possible mutagenic properties of the non-natural nucleotide is still performed as described above.

Generating Libraries of Metal-Binding Polymers: Based on the results of the above non-natural nucleotide assays, several libraries of ~10¹⁵ different nucleic acid sequences will be made containing one or two of the most polymerase compatible and chemically promising non-natural metal-binding nucleotides. Libraries are generated by PCR amplification of a synthetic DNA template library consisting of a random region of 20 or 40 nucleotides flanked by two 15-base constant priming regions (FIG. 54). The priming regions contain restriction endonuclease cleavage sites to allow cloning into vectors for DNA sequencing of pools or individual library members. One primer contains a chemical handle such as a primary amine group or a thiol group at its 5′ terminus and becomes the coding strand of the library. The other primer contains a biotinylated T at its 5′ terminus and becomes the non-coding strand. The PCR reaction includes one or two non-natural metal-binding deoxyribonucleotide triphosphates, three or two natural deoxyribonucleotide triphosphates, and a DNA polymerase compatible with the non-natural nucleotide(s). Following PCR reaction to generate the double-stranded form of the library and gel purification to remove unused primers, library member duplexes are denatured chemically. The non-coding strands are then removed by several washings with streptavidin-linked magnetic beads to ensure that no biotinylated strands remain in the library. Libraries of up to 10¹⁵ different members are generated by this method, far exceeding the combined diversity of previous combinatorial catalyst efforts.
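The relationship between the length of the random region and the number of molecules made can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a four-letter alphabet at each random position and takes 10¹⁵ molecules as the library size, the upper figure quoted in the text; the function names are hypothetical.

```python
def sequence_space(random_length, alphabet_size=4):
    """Number of distinct sequences possible in a random region of the given length."""
    return alphabet_size ** random_length


def max_fraction_sampled(random_length, molecules_made=10**15):
    """Upper bound on the fraction of sequence space a library of this size can contain."""
    return min(1.0, molecules_made / sequence_space(random_length))


space_20 = sequence_space(20)  # 4**20, about 1.1e12
space_40 = sequence_space(40)  # 4**40, about 1.2e24
```

Under these assumptions, a 20-nucleotide random region has fewer possible sequences than molecules made, so every sequence can in principle be present, whereas a 40-nucleotide region can be sampled only to roughly one part in a billion. This corresponds to the two regimes of library diversity discussed earlier in connection with diversification.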

Each library is then incubated in aqueous solution with a metal of interest from the following non-limiting list of water compatible metal salts (Fringuelli et al. Eur. J. Org. Chem. 2001, 2001, 439-455; Zaitoun et al. J. Phys. Chem. B 1997, 101, 1857-1860): ScCl₃, CrCl₃, MnCl₂, FeCl₂, FeCl₃, CoCl₂, NiCl₂, CuCl₂, ZnCl₂, GaCl₃, YCl₃, RuCl₃, RhCl₃, PdCl₂, AgCl, CdCl₂, InCl₃, SnCl₂, La(OTf)₃, Ce(OTf)₃, Pr(OTf)₃, Nd(OTf)₃, Sm(OTf)₃, Eu(OTf)₃, Gd(OTf)₃, Tb(OTf)₃, Dy(OTf)₃, Ho(OTf)₃, Er(OTf)₃, Tm(OTf)₃, Yb(OTf)₃, Lu(OTf)₃, IrCl₃, PtCl₂, AuCl, HgCl₂, HgCl, PbCl₂, or BiCl₃. Metals are chosen based on the specific chemical reactions to be catalyzed. For example, libraries aimed at reactions such as aldol condensations or hetero Diels-Alder reactions that are known (Fringuelli et al. Eur. J. Org. Chem. 2001, 2001, 439-455) to be catalyzed by Lewis acids are incubated with ScCl₃ or with one of the lanthanide triflates, while those aimed at coupling electron-deficient olefins with aryl halides are incubated with PdCl₂. The metalated library is then purified away from unbound metal salts by gel filtration using Sephadex or acrylamide cartridges, which separate DNA oligonucleotides 25 bases or longer from unbound small molecule components.

The ability of the polymer library (or of individual library members) to bind metals of interest is verified by treating the metalated library free of unbound metals with metal staining reagents such as dithiooxamide, dimethylglyoxime, KSCN (Francis et al. Curr. Opin. Chem. Biol. 1998, 2, 422-8) or EDTA (Zaitoun et al. J. Phys. Chem. B 1997, 101, 1857-1860) that become distinctly colored in the presence of different metals. The approximate level of metal binding is measured by spectrophotometric comparison with solutions of free metals of known concentration and with solutions of positive control oligonucleotides containing an EDTA group (which can be introduced using a commercially available phosphoramidite from Glen Research).

In Vitro Selections for Non-Natural Polymer Catalysts: Metalated libraries of evolvable non-natural polymers containing metal-binding groups are then subjected to one-pot, solution-phase selections for catalytic activities of interest. Library members that catalyze virtually any reaction that causes bond formation between two substrate molecules or that results in bond breakage into two product molecules are selected using the schemes proposed in FIGS. 54 and 55. To select for bond forming catalysts (for example, hetero Diels-Alder, Heck coupling, aldol reaction, or olefin metathesis catalysts), library members are covalently linked to one substrate through their 5′ amino or thiol termini. The other substrate of the reaction is synthesized as a derivative linked to biotin. When dilute solutions of library-substrate conjugate are reacted with the substrate-biotin conjugate, those library members that catalyze bond formation cause the biotin group to become covalently attached to themselves. Active bond forming catalysts can then be separated from inactive library members by capturing the former with immobilized streptavidin and washing away inactive polymers (FIG. 55).

In an analogous manner, library members that catalyze bond cleavage reactions such as retro-aldol reactions, amide hydrolysis, elimination reactions, or olefin dihydroxylation followed by periodate cleavage can also be selected. In this case, metalated library members are covalently linked to biotinylated substrates such that the bond breakage reaction causes the disconnection of the biotin moiety from the library members (FIG. 56). Upon incubation under reaction conditions, active catalysts, but not inactive library members, induce the loss of their biotin groups. Streptavidin-linked beads can then be used to capture inactive polymers, while active catalysts are able to elute from the beads. Related bond formation and bond cleavage selections have been used successfully in catalytic RNA and DNA evolution (Jäschke et al. Curr. Opin. Chem. Biol. 2000, 4, 257-62). Although these selections do not explicitly select for multiple turnover catalysis, RNAs and DNAs selected in this manner have in general proven to be multiple turnover catalysts when separated from their substrate moieties (Jäschke et al. Curr. Opin. Chem. Biol. 2000, 4, 257-62; Jaeger et al. Proc. Natl. Acad. Sci. USA 1999, 96, 14712-7; Bartel et al. Science, 1993, 261, 1411-8; Sen et al. Curr. Opin. Chem. Biol. 1998, 2, 680-7).

Catalysts of three important and diverse bond-forming reactions will initially be evolved: Heck coupling, hetero Diels-Alder cycloaddition, and aldol addition. All three reactions are water compatible (Kobayashi et al. J. Am. Chem. Soc. 1998, 120, 8287-8288; Fringuelli et al. Eur. J. Org. Chem. 2001, 2001, 439-455; Li et al. Organic Reactions in Aqueous Media. Wiley and Sons: New York, 1997) and are known to be catalyzed by metals. As Heck coupling substrates, both electron-deficient and unactivated olefins will be used together with aryl iodides and aryl chlorides. To our knowledge, Heck reactions with aryl chlorides in aqueous solution, as well as room temperature Heck reactions with non-activated aryl chlorides, have not yet been reported. Libraries for Heck coupling catalyst evolution use PdCl₂ as a metal source. Hetero Diels-Alder substrates include simple dienes and aldehydes, while aldol addition substrates consist of aldehydes together with either silyl enol ethers or simple ketones. Representative selection schemes for Heck coupling, hetero Diels-Alder, and aldol addition catalysts are shown in FIG. 57. The stringency of these selections can be increased between rounds of selection by decreasing reaction times, lowering reaction temperatures, or using less activated substrates (for example, less electron-poor aryl chlorides (Littke et al. J. Am. Chem. Soc. 2001, 123, 6989-7000) or simple ketones instead of silyl enol ethers).

Evolving Non-Natural Polymers: Diversification and Selecting for Stereospecificity

Following each round of selection, active library members are amplified by PCR with the non-natural nucleotides and subjected to additional rounds of selection to enrich the library for desired catalysts. These libraries are truly evolved by introducing a diversification step before each round of selection. Libraries are diversified by random mutagenesis using error-prone PCR (Caldwell et al. PCR Methods Applic. 1992, 2, 28-33) or by recombination using modified DNA shuffling methods that recombine small, non-homologous nucleic acid fragments. Because error-prone PCR is inherently less efficient than normal PCR, error-prone PCR diversification will be conducted with only natural dATP, dTTP, dCTP, and dGTP and using primers that lack chemical handles or biotin groups. The resulting mutagenized products are then subjected to PCR translation into non-natural nucleic acid polymers using standard PCR reactions containing the non-natural nucleotide(s), the biotinylated primer, and the amino- or thiol-terminated primer.
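The diversification step can be pictured with a small simulation of random point mutagenesis over the four natural bases. This is a minimal sketch under simplifying assumptions that are not taken from the description above: substitutions are uniform and independent at each position, whereas real error-prone PCR has a biased mutation spectrum, and the per-base `error_rate`, the `copies` count, and the function names are hypothetical.

```python
import random

NATURAL_BASES = "ACGT"  # only natural nucleotides are used in this step


def mutagenize(template, error_rate, rng):
    """Return a copy of `template` with random point substitutions.

    Each position is replaced by a different natural base with probability
    `error_rate` (an assumed, uniform per-base error rate).
    """
    out = []
    for base in template:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in NATURAL_BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def diversify(pool, error_rate, copies, seed=0):
    """Produce `copies` mutagenized descendants of every template in `pool`."""
    rng = random.Random(seed)  # seeded so that a run is reproducible
    return [mutagenize(t, error_rate, rng) for t in pool for _ in range(copies)]
```

In this picture, the output of `diversify` corresponds to the mutagenized natural-DNA products, which would then be translated into non-natural polymers in a separate, standard PCR step.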

In addition to simply evolving active catalysts, the in vitro selections described above are used to evolve non-natural polymer libraries in powerful directions difficult to achieve using other catalyst discovery approaches. An enabling feature of these selections is the ability to select either for library members that are biotinylated or for members that are not biotinylated. Substrate specificity among catalysts can therefore be evolved by selecting for active catalysts in the presence of the desired substrate and then selecting in the same pot for inactive catalysts in the presence of one or more undesired substrates. If the desired and undesired substrates differ by the configuration at one or more stereocenters, enantioselective or diastereoselective catalysts can emerge from rounds of selection. Similarly, metal selectivity can be evolved by selecting for active catalysts in the presence of desired metals and selecting for inactive catalysts in the presence of undesired metals. Conversely, catalysts with broad substrate tolerance can be evolved by varying substrate structures between successive rounds of selection.
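The logic of pairing a positive selection with a negative selection can be sketched as a filter. The example below is illustrative only: it assumes each library member can be summarized by a single activity score per substrate and that a member becomes biotinylated whenever that score meets a fixed `threshold`. The scores, the threshold, and all names are hypothetical.

```python
def select_specific(library, desired, undesired, threshold=0.5):
    """Return the names of members active on `desired` but on no `undesired` substrate.

    `library` maps a member name to a dict of {substrate: activity score}.
    A member counts as biotinylated on a substrate when its (assumed)
    activity score is at least `threshold`.
    """
    survivors = []
    for name, activity in library.items():
        # Positive selection: keep members biotinylated with the desired substrate.
        positive = activity.get(desired, 0.0) >= threshold
        # Negative selection: keep members NOT biotinylated with any undesired substrate.
        negative = all(activity.get(s, 0.0) < threshold for s in undesired)
        if positive and negative:
            survivors.append(name)
    return survivors


# Hypothetical library: activity on the (R)- and (S)-configured substrates.
example_library = {
    "member_1": {"R": 0.9, "S": 0.8},  # active but unselective
    "member_2": {"R": 0.9, "S": 0.1},  # active and selective for R
    "member_3": {"R": 0.1, "S": 0.1},  # inactive
}
selective = select_specific(example_library, "R", ["S"])
```

Here only `member_2` passes both steps, mirroring how a stereoselective catalyst would survive selection for activity on one stereoisomer followed by selection for inactivity on the other.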

Finally, the observation of sequence-specific DNA-templated synthesis in DMF and CH₂Cl₂ suggests that DNA-tetraalkylammonium cation complexes can form base-paired structures in organic solvents. This finding raises the possibility of evolving our non-natural nucleic acid catalysts in organic solvents using slightly modified versions of the selections described above. The actual bond forming and bond cleavage selection reactions will be conducted in organic solvents, the crude reactions will be ethanol precipitated to remove the tetraalkylammonium cations, and the immobilized avidin separation of biotinylated and non-biotinylated library members will be performed in aqueous solution. PCR amplification of selected members will then take place as described above. The successful evolution of reaction catalysts that function in organic solvents would expand considerably both the scope of reactions that can be catalyzed and the utility of the resulting evolved non-natural polymer catalysts.

Characterizing Evolved Non-Natural Polymers: Libraries subjected to several rounds of evolution are characterized for their ability to catalyze the reactions of interest both as pools of mixed sequences and as individual library members. Individual members are isolated from evolved pools by ligating PCR amplified sequences into DNA vectors, transforming dilute solutions of ligated vectors into competent bacterial cells, and picking single colonies of transformants. Assays on pools or individual sequences are conducted both in the single turnover format and in a true multiple turnover catalytic format. For the single turnover assays, the quantity measured is either the rate at which substrate-linked bond formation catalysts effect their own biotinylation in the presence of free biotinylated substrate or the rate at which biotinylated bond breakage catalysts effect the loss of their biotin groups. Multiple turnover assays are conducted by incubating evolved catalysts with small molecule versions of substrates and analyzing the rate of product formation by TLC, NMR, mass spectrometry, HPLC, or spectrophotometry.
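One way a single turnover rate might be extracted from such measurements is sketched below. The sketch assumes, as a simplification not stated above, that self-biotinylation (or biotin loss) follows first-order kinetics, so that the reacted fraction at time t is 1 - exp(-k_obs·t). The function name and the data layout are hypothetical.

```python
import math


def estimate_k_obs(times, fractions):
    """Estimate an observed first-order rate constant from a time course.

    `times` are incubation times and `fractions` the fraction of library
    member that has reacted at each time (0 <= f < 1). Assuming
    f = 1 - exp(-k * t), the quantity -ln(1 - f) is linear in t, and k is
    the least-squares slope of a line through the origin.
    """
    num = 0.0
    den = 0.0
    for t, f in zip(times, fractions):
        if not 0.0 <= f < 1.0:
            raise ValueError("each reacted fraction must lie in [0, 1)")
        y = -math.log(1.0 - f)
        num += t * y
        den += t * t
    if den == 0.0:
        raise ValueError("at least one nonzero time point is required")
    return num / den
```

The returned value has units of inverse time in whatever unit `times` is given. Comparing this value across pools from successive rounds would be one way to follow the progress of a selection.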

Once multiple turnover catalysts are evolved and verified by these methods, detailed mechanistic studies can be conducted on the catalysts. The DNA sequences corresponding to the catalysts are revealed by sequencing PCR products or DNA vectors containing the templates of active catalysts. Metal preferences are evaluated by metalating catalysts with a wide variety of metal cations and measuring the resulting changes in activity. The substrate specificity and stereoselectivity of these catalysts are assessed by measuring the rates of turnover of a series of substrate analogs. Diastereoselectivities and enantioselectivities of product formation are revealed by comparing reaction products with those of known stereochemistry. Previous studies suggest that active sites buried within large chiral environments often possess high degrees of stereoselectivity. For example, peptide-based catalysts generated in combinatorial approaches have demonstrated poor to excellent stereoselectivities that correlate with the size of the peptide ligand (Jarvo et al. J. Am. Chem. Soc. 1999, 121, 11638-11643), while RNA-based catalysts and antibody-based catalysts frequently demonstrate excellent stereoselectivities (Jäschke et al. Curr. Opin. Chem. Biol. 2000, 4, 257-262; Seelig et al. Angew. Chem. Int. Ed. Engl. 2000, 39, 4576-4579; Hilvert, D. Annu. Rev. Biochem. 2000, 69, 751-93; Barbas et al. Science 1997, 278, 2085-92; Zhong et al. Angew. Chem. Int. Ed. Engl. 1999, 38, 3738-3741; Zhong et al. J. Am. Chem. Soc. 1997, 119, 8131-8132; List et al. Org. Lett. 1999, 1, 59-61). The direct selections for substrate stereoselectivity described above should further enhance this property among evolved catalysts.

Structure-function studies on evolved catalysts are greatly facilitated by the ease of automated DNA synthesis. Site-specific structural modifications are introduced by synthesizing DNA sequences corresponding to “mutated” catalysts in which bases of interest are changed to other bases. Changing the non-natural bases in a catalyst to a natural base (U* to C or A* to G) and assaying the resulting mutants may identify the chemically important metal-binding sites in each catalyst. The minimal polymer required for efficient catalysis is determined by synthesizing and assaying progressively truncated versions of active catalysts. Finally, the three-dimensional structures of the most interesting evolved catalysts complexed with metals are solved in collaboration with local macromolecular NMR spectroscopists or X-ray crystallographers.
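The sets of sequences to synthesize for such structure-function studies can be enumerated programmatically. The following is a minimal sketch with several assumptions: a catalyst is represented as a list of base tokens written 5′ to 3′, each non-natural base is reverted one at a time (U* to C, A* to G), and truncation proceeds one base at a time from a chosen end. The choice of end, the minimum length, and all names are hypothetical.

```python
# Reversion of non-natural bases to natural bases, as described in the text.
REVERSION = {"U*": "C", "A*": "G"}


def point_revertants(seq):
    """Return (position, mutant) pairs, reverting one non-natural base each."""
    mutants = []
    for i, base in enumerate(seq):
        if base in REVERSION:
            mutants.append((i, seq[:i] + [REVERSION[base]] + seq[i + 1:]))
    return mutants


def truncations(seq, min_len, from_end="3'"):
    """Return progressively shorter versions of `seq`, down to `min_len` bases.

    `seq` is assumed to be written 5' to 3'; bases are removed one at a time
    from the end named by `from_end` ("3'" or "5'").
    """
    out = []
    for n in range(len(seq) - 1, min_len - 1, -1):
        out.append(seq[:n] if from_end == "3'" else seq[len(seq) - n:])
    return out


# Hypothetical five-base catalyst fragment containing two non-natural bases.
example = ["G", "U*", "A", "A*", "C"]
revertants = point_revertants(example)
shortened = truncations(example, min_len=3)
```

Assaying each revertant would indicate which modified positions matter for activity, and the shortest truncation that retains activity would approximate the minimal polymer.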

1. A method of synthesizing one or more chemical compounds, the method comprising the steps of: providing one or more templates, which one or more templates optionally have a reactive unit associated therewith; contacting one or more transfer units having an anti-codon and reactive unit with said one or more templates under conditions to allow for hybridization of the one or more anti-codons to the template, and reaction of the reactive units.
2.-46. (canceled)
47. A library of chemical compounds, the library comprising oligonucleotide templates covalently attached to macrocyclic peptides comprising a macrocyclic ring comprising four amino acids and three peptide bonds, wherein the amino acids optionally comprise a non-proteinogenic side chain and the macrocyclic ring is cyclized through a side chain of one of the amino acids, and wherein each oligonucleotide has a nucleotide sequence informative of the synthetic history of the macrocyclic peptide covalently attached therewith.
48. The library of claim 47, wherein each macrocyclic ring comprises 14 ring atoms.